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Preface 


This document constitutes the final report describing work conducted under Grant NAG 
2 - 665. The work has been directed at the development of efficient multigrid methods for 
the solution of aerodynamic problems involving complex geometries, including the devel- 
opment of computational methods for the solution of both inviscid and viscous transonic 
flow problems. The emphasis is on problems of complex, three-dimensional geometry. The 
methods developed are based upon finite-volume approximations to both the Euler and 
the Reynolds-Averaged Navier-Stokes equations. The methods are developed for use on 
multi-block grids using diagonalized implicit multigrid methods to achieve computational 
efficiency. The work is focused upon aerodynamic problems involving complex geometries, 
including advanced engine inlets. 

The Principal Investigator for the project has been David A. Caughey, Professor and 
Director of the Sibley School of Mechanical and Aerospace Engineering at Cornell University. 
The Grant Technical Monitor has been 


Dr. W. J. Chyu 

Applied Aerodynamics Branch 

Mail Stop 227-6 

NASA Ames Research Center 

Tel.: (415) - 604-6208 


Further information regarding this work can be obtained from: 


Professor David A. Caughey 
105 Upson Hall 
Cornell University 
Ithaca, New York 14853 
Tel: (607) - 255-3372 
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I. Review of Work 


Work has been directed towards the efficient solution of the inviscid Euler and Reynolds- 
averaged Navier-Stokes equations for aerodynamic flows involving complex geometries. Such 
flows include, but are not limited to, transonic flows through engine inlets and those past 
complete aircraft configurations. The research focuses upon the development of implicit 
multigrid algorithms for these problems, treating three-dimensional problems of great geo- 
metrical complexity through the use of multi-block grid systems. The work includes further 
development of mesh systems suitable for three-dimensional inlet and wing geometries, the 
exploration of parallel computing strategies, and demonstrations of the suitability of the 
method to complex engine inlet configurations of interest to NASA. 

The multi-block grid approach involves subdividing the computational domain into a 
number of sub-domains, each of which is topologically equivalent to a rectangular com- 
putational domain. Each sub-domain is chosen to be simple enough that a grid can be 
generated relatively easily. Several grid generation packages are available for generating 
the grids within the blocks [1,2]. The problem of geometric generality is thus transferred 
to that of treating the interfaces between the blocks in the multi-block grid. There are at 
least three advantages to this approach. First is the obvious geometric generality arising 
from the relative ease with which a boundary-conforming grid can be generated within each 
block. A second advantage is the universality of the code which results when the blocks are 
chosen so that the boundary conditions on each face of each block are restricted to be of a 
single type. This means that the flow solver does not need to be modified each time a new 
grid topology is introduced. In effect, the topology of the grid system is embedded within 
the structure of the multiple blocks, which is provided as input to the flow solver from the 
grid generation step. Finally, the multi-block approach provides a natural framework within 
which to implement parallel processing. 

In the following sections, work performed on each aspect of the algorithm will be 
summarized. More complete information is contained in the Appendices, which contain 
copies of papers describing the work in more detail. 

A. Implicit Multigrid Solution of the Euler Equations 

Implicit methods are attractive for the efficient solution of the Navier-Stokes equations on 
the high aspect-ratio grids required for resolution of the flow at large values of the Reynolds 
number, but only if they are efficient enough computationally that the increase in conver- 
gence rate more than compensates for the increased computational work required for the 
solution of an algebraic system of equations in each iteration. The most widely used implicit 
methods for multi-dimensional problems are based upon an approximate factorization of the 
implicit operator into the product of one-dimensional factors (Briley & McDonald [3], Beam 
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Sz Warming [4]). Since each factor then has a bandwidth that is independent of the num- 
ber of unknowns, the computational work per time step is proportional to the number of 
unknowns - i.e., to the number of mesh cells. For the Euler equations in three-dimensions, 
which comprise a hyperbolic system of five coupled first-order partial differential equations, 
this procedure is still relatively expensive since the blocks are 5x5 and the need to include 
fourth-difference numerical dissipation terms leads to the requirement to solve block pen- 
tadiagonal systems of equations along each grid line in each coordinate direction for each 
time step. For steady flows, the procedure can be made much more efficient by performing 
similarity transformations on the Jacobians of the flux vectors in each one-dimensional fac- 
tor, with the result that only a set of five decoupled scalar pentadiagonal equations needs 
to be solved along each line to update the solution for each iteration (time step). (Chaussee 
& Pulliam [5]). 

Even implicit algorithms are usually limited to rather modest values of the Courant 
number in practice, however, and many hundreds, or even thousands, of iterations of- 
ten are required for convergence of the solution on fine grids. The multigrid method is 
an attractive convergence acceleration technique, which is capable of greatly reducing the 
number of iterations required for convergence. Jameson [6] has successfully implemented 
a Full-Approximation Scheme version of multigrid for the Euler equations on structured 
grid systems. The multigrid scheme was first implemented for the explicit, Runge-Kutta 
time-stepping scheme of Jameson et al. [7], and later by Jameson & Yoon [8] using (block) 
ADI as the smoothing algorithm. 

Caughey [9] has developed a diagonalized version of the ADI-multigrid scheme for two- 
dimensional airfoil problems and, with Turkel [10], has developed improved treatments of 
the dissipation terms that introduce less spurious entropy near stagnation points. Yadlin & 
Caughey [11] have extended the algorithm to three-dimensional problems, including com- 
putations of the transonic flow past a swept wing. More recently, Caughey has implemented 
a symmetric TVD form of the numerical dissipation that greatly increases the robustness 
of the scheme [12]. 

B. Multi-Block Grids 

The implementation of implicit multigrid methods on multi-block grids seems relatively 
straightforward, but there are a number of strategic choices that must be made, and it 
is important to assess the effects of these upon the performance of the algorithm. Yadlin 
& Caughey [13] have performed experiments for two-dimensional problems on multi-block 
grids to answer some of these questions before proceeding with three-dimensional implemen- 
tations. The implicit multigrid method outperformed by a considerable margin the implicit 
method of Belk & Whitfield [14], which is based upon an approximate Lower-Upper factor- 
ization of the implicit operator. In addition, we have investigated the speed-up achievable 
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using parallel processing on a large, shared-memory multiprocessing supercomputer (the 
IBM 3090-600E). 

The most important aspect of multigrid on multi-block grids involves the decision of 
whether to perform the multigrid cycles within the blocks independently or to perform the 
multigrid cycles within all blocks concurrently. We have termed the former a ‘vertical’ 
multigrid strategy and the latter a ‘horizontal’ multigrid strategy. The ‘horizontal’ strat- 
egy is feasible on shared-memory computers and promises the best multigrid convergence 
acceleration, but at the cost of considerably more overhead associated with initiating and 
synchronizing parallel tasks (or with communication delays on distributed memory archi- 
tectures). The ‘vertical’ strategy considerably reduces the parallel overhead, but with a 
possible penalty in multigrid efficiency. The ‘horizontal’ multigrid strategy has been ap- 
plied successfully to the solution of the Euler equations on multi-block grids (see, e.g., Yadlin 
[15]), including parallel implementation on the six- processor IBM 3090-600E (Yadlin [15] 
and Yadlin & Caughey [16]). 

An implementation of the ‘vertical’ multigrid strategy for the three-dimensional Euler 
algorithm on multi-block grids which avoids the convergence rate penalties associated with 
earlier attempts has been developed recently by Yadlin &: Caughey [17]. As described 
above, the ability to use the ‘vertical’ strategy, in which the multigrid cycles are advanced 
independently within each block, allows much greater flexibility in grid generation and in 
multigrid strategies for complex geometries; in addition, the parallel implementation of this 
strategy has much less overhead, with correspondingly greater parallel efficiencies. The key 
to maintaining good multigrid performance for the ‘vertical’ strategy is the introduction 
of asynchronous updating of the boundary conditions on the coarser grid levels of the 
multigrid sequence. This is done using buffer arrays which are updated with the latest 
boundary information as soon as it is computed in each block, while each block also reads 
from the buffer arrays of neighboring blocks the latest boundary condition information 
currently available. 

Work also has been done on the incorporation of more general interface conditions on 
the interblock boundaries to account for discontinuities in the grids across these interfaces. 
A fully conservative procedure, which requires knowledge of the solution in only one layer 
of cells on each side of the interface, has been developed. This data is used to compute the 
fluxes across the cells on one side of the interface, then the conservation condition is used 
to infer the fluxes through the cell faces on the other side. A similar procedure is used to 
guarantee that the dissipative fluxes also are treated conservatively. The solution in only 
one layer of cells on each side of the interface is required, even including the evaluation of 
the dissipative terms, since it has been found that the fourth-difference dissipative terms 
can be neglected, and only the second-difference terms need be computed in those cells 
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adjoining the interblock boundaries. The development of this idea in two dimensions has 
presented by Wang & Caughey [18]. More complete results in two dimensions and results 
for a complicated three-dimensional inlet, the geometry and mesh for which were provided 
by NASA researchers, are described by Wang [19] and by Wang h Caughey [20]. 

C. Implicit Multigrid Solution of the Navier-Stokes Equations 

For the two-dimensional Navier-Stokes equations, a study of various ways to include con- 
tributions of the viscous terms in the implicit operator, while maintaining the diagonal 
form of the algorithm, has been performed by Tysinger Sz Caughey [21, 22]. The results 
of that study showed that several approximate treatments of the viscous contributions to 
the implicit operator could increase the stability of the scheme beyond that of a scheme 
in which the viscous terms were treated only explicitly. In particular, incorporation of a 
simple diagonal approximation to the contributions of the viscous terms results in a robust 
scheme at little additional cost. A three-dimensional, multiblock implementation of the 
Navier-Stokes algorithm has also been described by Yadlin, Tysinger & Caughey [23]. 

A two-dimensional version of the multi-block version of the implicit viscous algorithm 
has been further developed and has been implemented on a distributed-memory, multi- 
processing system consisting of a ring of IBM RS/6000 work stations interconnected on a 
high-speed optical network. The implementation was performed in a highly modular fash- 
ion, using general Unix socket-based utilities, which feature makes the code easily portable 
to any UNIX-based system. A complete description of this work is given by Tysinger [24] 
and Tysinger &: Caughey [25]. 

Incorporation of the turbulence model of Baldwin & Lomax [26] into the viscous code 
has allowed the computation of flows at high Reynolds number. Varma &; Caughey [27] 
describe the results of these turbulent flow calculations, demonstrating that the multigrid 
efficiency remains high even on the highly-stretched meshes required to resolve these high 
Reynolds number flows. Careful consistency checks on these solutions have suggested a 
way in which to evaluate the integrated effect of numerical dissipation on the accuracy 
of the solution. This idea, along with examples demonstrating its application to laminar 
and turbulent flow calculations, has been described by Varma & Caughey [28]. Complete 
details of the turbulent flow computations, including the accuracy-assessment procedure, 
are described by Varma [29]. 

Finally, a survey paper summarizing all the algorithmic work on the Euler and Navier- 
Stokes problems has been written by Caughey [30]. 
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Abstract 

A symmetric Total-Variation-Diminishing (TVD) for- 
mulation of the numerical dissipation terms has been 
incorporated into a diagonalized alternating direction 
implicit multigrid algorithm to solve the Euler equa- 
tions of inviscid compressible flow. The new treatment 
of the dissipation makes is possible to capture both 
very strong and very weak shocks, virtually without 
oscillation for the steady flows of interest here. In ad- 
dition, the TVD constraint fixes one of the two previ- 
ously arbitrary constants in the formulation of the dis- 
sipation, and results in both converged solutions and 
convergence rates which are relatively insensitive to the 
choice of the remaining dissipation parameter. 

I. Introduction 

The finite-volume approach provides an attractive 
framework for approximating the spatial derivatives 
appearing in systems of conservation laws, such as the 
Euler equations of compressible flow. The popular 
scheme of Jameson et al 1 reduces to a central differ- 
ence approximation on uniform cartesian grids, how- 
ever, and numerical dissipation terms must be added to 
stabilize the scheme when solving hyperbolic systems 
of equations. The added terms are generally chosen in 
the form of an adaptive blend of second and fourth dif- 
ferences of the solution in each coordinate direction, 
modulated so that the second differences are active 
near discontinuities, with the fourth differences pro- 
viding only a background dissipation in regions away 
from shocks. A sensor, based on an undivided second 
difference of the pressure, is used to modulate these dis- 
sipative terms. The dissipative terms are scaled with 
the spectral radii of the flux- vector Jacobians in each 
coordinate direction (see, e.g., Caughey 2 ), but also are 
proportional to constants which must be chosen some- 
what arbitrarily. 

The connection between the dissipation terms added 


to central difference schemes and the implicit dissipa- 
tion contained in upwind methods has been discussed 
by Pulliam 3 , and Swanson & Turkel 4 have shown how 
these dissipation terms can be chosen to duplicate the 
properties of upwind TVD schemes. The essential el- 
ements in such a symmetric TVD dissipation scheme 
are: (1) an improved shock sensor, and (2) the formula- 
tion of the coefficients of the dissipation terms as matri- 
ces. The former allows the numerical scheme to reduce 
to the same first-order approximation in the vicinity 
of shocks of nearly arbitrary strength, while the lat- 
ter allows the dissipation terms to be scaled properly 
for each individual equation when treating a system of 
conservation laws. 

For the explicit Runge-Kutta time-stepping scheme 
used by Swanson & Turkel 4 and Jorgenson & Turkel 5 , 
the additional cost of evaluating the matrix viscosi- 
ties is considerable. When a diagonalized alternating 
direction implicit (ADI) scheme is used to march the 
equations in time, however, the additional computation 
associated with implementing the matrix form of vis- 
cosities is less significant, since the modal matrices of 
the flux- vector J acobians have already been calculated 
in order to diagonalize the factors of the ADI scheme. 

In the following section, the formulation of the ma- 
trix viscosities and their TVD implementation are de- 
scribed. Results are then presented to illustrate the 
effect of the matrix and TVD viscosities on solutions 
containing shocks of varying strengths. Convergence 
histories are also presented for several cases to illus- 
trate the efficiency of the multigrid implementation. 


II. Formulation 


The Euler equations in two space dimensions can be 
written in the generalized coordinates £ and 77 in the 
form 


3w df dg 

dt di dr/ 


= 0 , 


( 1 ) 


j 
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where w, g and f are four- vectors representing the con- 
served variables and the fluxes in the £ and q coordi- 
nate directions, respectively. In smooth regions of the 
flow, this system of equations is equivalent to the quasi- 
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linear system 


dw . dw 

^r +A W + B W =01 


( 2 ) 


where A = {<9f/3w} and B = {dg/dw} are the Ja- 
cobians of the flux- vectors f and g with respect to the 
solution vector w . 

Since the Euler equations are hyperbolic, both A and 
B can be diagonalized, though not, in general, by the 
same similarity transformation. That is, Eqs.(2) can 
be written in the form 

-£j- + QaAaQ^ 1 ^- + QbAbQs 1 ^- = 0, (3) 


where A^ and A b are diagonal matrices, and and 
Q b are the modal matrices of the, flux- vector Jacobians 
A and B, respectively. 

A symmetric finite-volume approximation to these 
equations, of the form introduced by Jameson ti al 1 , 
must be stabilized by the addition of terms represent- 
ing numerical dissipation. These are usually incorpo- 
rated in a directionally split form by adding approx- 
imations to second and fourth derivatives in each of 
the coordinate directions. The form of the dissipation 
terms can be described for the one-dimensional prob- 
lem, say in the £ direction. 

For this one-dimensional problem, the difference equa- 
tions for the original scheme of Jameson et al 1 can be 
interpreted as an approximation to 


dw df 
~dt + dt 


a <"< a >! G 


( 2 ) 


dw 

~di 



where p(A) is the spectral radius of the Jacobian ma- 
trix A, and the coefficients and are functions of 
gradients in the solution. These coefficients are defined 
in terms of constants and such that 


magnitude of the shock sensor depends on the strength 
of the shock, and the same scaling is used for all equa- 
tions in the system. This latter difficulty might be 
expected to cause no particular problems for transonic 
flows, for which all but one of the eigenvalues have 
roughly the same magnitude (to within a factor of two). 

The new formulation, based on the work of Swanson 
& Turkel 4 and Jorgensen <k Turkel 5 , addresses both 
these issues. The first step in constructing the im- 
proved scheme is to replace the scalar coefficient of the 
dissipative terms appearing in Eq.(4) by a matrix co- 
efficient. The difference equations for the new scheme 
can be interpreted as central difference approximations 
to 


dw df _ 

~dt + d( ~ 



Q^lA^jQ 


-l 

A 




where | | is a diagonal matrix, the elements of which 

are the absolute values of the eigenvalues of A. Com- 
parison of Eq.(8) with Eq.(4) shows that the earlier 
scheme can be interpreted as one in which the matrix 
| A>% | is replaced by p(A)I, where I is the identity ma- 
trix. Thus, the original (scalar viscosity) scheme can 
be seen to have the viscosity for all equations in the 
system controlled by the largest eigenvalue, while the 
matrix viscosity incorporates dissipative terms which 
are scaled according to the appropriate eigenvalue for 
each equation in the system. In practice, the matrix 
| A^ 4 1 must be modified so that the dissipative coeffi- 
cients do not become zero when individual eigenvalues 
vanish (e.g., at stagnation and/or sonic points). This 
is achieved by redefining the elements of the diagonal 
matrix according to 

\A a \ = Diag{max(rp(A), |A,|)}, (9) 


4 2) = «(*> 


Vi, 


( 5 ) 


where 
P by 


the switch i/, is defined in terms of the pressure 


\pi±i ~ 2p,- +p,-_il 
P. + l + 2p; +p,_i ’ 


( 6 ) 


and is designed to activate the second difference terms 
in the vicinity of shock waves. Also, 


4 4) = max {o., A£ 2 [k ( 4) - e[ 2) ] } (7) 


is designed to provide a nearly constant background 
level of fourth-difference dissipation, except in the 
vicinity of shocks, where the coefficient is set to zero 
when ^ becomes of order unity. While this form of dis- 
sipation is relatively efficient computationally, it can- 
not be expected to be effective in all cases since the 


where A; are the eigenvalues of A, and r is a small 
constant, typically chosen to be 0.20. The implemen- 
tation of the scheme defined according to Eqs.(8) that 
is most nearly analogous to the original scalar model 
retains Eqs. (5) - (7) to define the coefficients of the 
dissipation terms; we term this the matrix dissipation 
scheme. 

The matrix dissipation scheme defined above tends to 
introduce less spurious dissipation into solutions, espe- 
cially on relatively coarse grids, as would be expected. 
At the same time, because of its reduced dissipation 
solutions containing shock waves can suffer from ex- 
cessive oscillations in their vicinity. To reduce this ten- 
dency, the coefficients controlling the dissipation can be 
redefined in such a way that near shock waves the dif- 
ferences reduce exactly to first-order accurate one-sided 
differences, at least in the quasi-linear approximation 
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considered here; for this reason, this choice of the scal- 
ing coefficients is here referred to as the TVD matrix 
form of dissipation. For this purpose, the coefficients 


are redefined as 


,(2) _ il 

C C — ^ > 


where 


Vi = 


b.'+i - 2pi + Pi-i | 

|p»+i -Pi\ + \Pi-Pi-i\ + &' 


( 10 ) 


( 11 ) 


Here S is a constant of 0(A£ 2 ) which is used to prevent 
activation of the switch in smooth regions of the flow 
where the pressure is nearly a constant. Similarly, 


= /c^Af 2 max (0., 1. — 2i/,) . (12) 

The new shock sensor i /,• is small in regions in which 
the flow is smooth, but takes on a maximum value of 
unity, independent of shock strength, at local extrema 
of the pressure field. 

These new forms of dissipation have the effect of ap- 
proximating the system of Eqs.( 8 ) as 


(13) 

where v = is the vector of characteristic vari- 

ables corresponding to the one-dimensional problem in 
the £ coordinate direction. From the form of Eqs.(13), 
it can be seen that the TVD formulation reduces to 
a simple upwind approximation near shocks (where 
= 1/2 and = 0 .), but has only background 
fourth-difference dissipation in smooth regions. It is 
relatively easy to incorporate a sensor which tests sep- 
arately for extrema in the appropriate characteristic 
variable for each equation in the decoupled system cor- 
responding to Eqs.(13), but for the steady transonic 
flows of interest here, this seems to have little added 
benefit. 


dv . dv d 

~dt +AA J( ~ A *d( 


[| A J />:>— -e« — 


For the two-dimensional flows considered here, dissipa- 
tive terms must also be added in the 77 coordinate di- 
rection, and are defined analogously to those described 
above for the £ direction. Thus, the equations have the 
form 


3 w di <9g _ 

~dt + d£ + Ihj = 


l 
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where the notation for the dissipative terms in the 77 - 
direction is directly analogous to that used earlier for 
the dissipative terms in the ^-direction. 

Also, even when the matrix form of dissipation is used, 
but particularly for the TVD matrix form, excessive 


spurious entropy can be generated at stagnation points, 
unless the grid is excessively fine. Since the flow is 
smooth in these regions, the second difference dissipa- 
tion is not needed there, and can be reduced. In the 
present implementation, this is achieved by multiply- 
ing the switching function V{ in either Eq.(5) or Eq.(10) 
by the square of the ratio of the local Mach number to 
that in the free stream. The numerical implementation 
corresponding to Eqs.(14) requires that e^ 2 ) be defined 
at the cell faces, and the Mach number used for this 
scaling is taken to be the maximum of the averages 
in the two cells sharing the face; this was shown by 
Caughey & Turkel 6 to result in less oscillation in en- 
tropy near shock waves for the original scalar form of 
the dissipation. 

The spatial discretization of these equations is based 
upon the finite-volume approximation of Jameson et 
al 1 . For the purpose of computing the Euler fluxes, the 
flow variables are assumed to be constant on each cell 
face, and are taken to be the average of the cell average 
quantities in the two cells sharing the face. The fluxes 
are determined by multiplying the fluxes per unit area 
thus determined by the projected areas of the cell face 
in the appropriate coordinate directions. The dissipa- 
tive fluxes are computed as simple finite differences, as 
described by Jameson et al 1 and Caughey 2 . 

The implicit, diagonalized ADI scheme is essentially 
unchanged from that described by Caughey 2 . In that 
formulation, the equations are linearized to allow ap- 
proximation of the spatial derivatives as weighted aver- 
ages of differences at the old and new time levels. The 
time-linearized implicit operator is then approximated 
as the product of two one-dimensional factors. In or- 
der to improve computational efficiency, each of these 
implicit factors is diagonalized using a local similarity 
transformation, following Chaussee & Pulliam 7 . The 
modal matrices required to achieve this diagonaliza- 
tion are the same as those used above to implement the 
matrix-based forms of dissipation, whence they need to 
be calculated only once per time step. The resulting 
implicit scheme requires the solution of four scalar pen- 
tadiagonal systems along each line in each of the two 
mesh directions for each time step. 

The implicit scheme is implemented within the frame- 
work of the multigrid method, as described by 
Jameson 8 and by Caughey 2 , to further accelerate con- 
vergence to the steady state. A simple fixed-strategy 
saw-tooth cycle in which one time step is performed 
on each grid before the solution is restricted to the 
next coarser grid has been used. For this simple strat- 
egy, each multigrid cycle requires less than 1 1/3 Work 
Units, if a work unit is defined as the work required 
for one time step on the finest grid and the overhead 
associated with restriction of the solution and residuals 
to coarser grids and of prolongation of the corrections 
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NACA 0012 -- Matrix Dissipation 
Mach 0.8000 Alpha 1.250 

Cl 0.3605 Cd 0.0230 Cm -0.0392 

Grid 160x32 Work 400.61 Res 0.961E-07 
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NACA 0012 — TVD Matrix Dissipation 
Mach 0.8000 Alpha 1.250 

Cl 0.3534 Cd 0.0230 Cm -0.0386 

Grid 160x32 Work 400.61 Res 0.116E-10 


Fig. 1 Airfoil surface pressure distribution; NACA Fig. 2 Airfoil surface pressure distribution; NACA 

0012 airfoil at free stream Mach number 0.80 and 1.25 0012 airfoil at free stream Mach number 0.80 and 1.25 

degrees angle of attack; matrix dissipation. degrees angle of attack; TVD matrix dissipation. 


to finer grids is neglected. Both the matrix and TVD 
matrix implementations of the dissipation terms are 
more robust than the original scalar form; when the 
scalar dissipation model was used, it was found to be 
necessary to run the smoothing steps on coarser grids 
of the multigrid sequence at a Courant number equal 
to half its value on the fine grid; for both the matrix 
implementations, the same value of Courant number 
typically is used on all grids in the multigrid sequence. 

III. Results 

Results are presented here for steady transonic flows 
past airfoils to illustrate the shock-capturing capabili- 
ties of the schemes, as well as the convergence prop- 
erties of the multigrid algorithm. The calculations 
are performed on “0”-type grids, usually containing 
160 x 32 cells in the wrap-around and body-normal 
directions, respectively. The mesh extends from the 
airfoil surface to a nearly circular far field boundary 
located approximately 30 chord lengths from the body; 
the ratio of areas of the largest to smallest cells in 
the mesh is approximately 5 x 10 7 . All calculations 
presented here have been computed using local time- 
stepping at a constant Courant number of C = 8.0 on 
all meshes in the multigrid sequence. Five levels of 
multigrid were used for all calculations presented here, 
with the coarsest grid containing 10 x 2 cells. Several 
levels of grid sequencing were used to start the calcu- 
lations on the finer grids, i.e., initial conditions were 
obtained by interpolating from converged solutions on 


coarser grids having half as many cells in each coordi- 
nate direction. 

Fixed values of the dissipation coefficients were used 
for all cases presented here. For the matrix dissipation 
scheme, values of = 4.0 and = 0.0625 were 
used; for the TVD matrix dissipation scheme, a value 
of = 0.125 was used. 

The first result is for the flow past the NACA 0012 j 
airfoil at a free stream Mach number of 0.80 and 1.25 
degrees angle of attack. For these conditions, a mod- 
erately large pocket of supersonic flow forms in the 
region above the airfoil upper surface, terminated by a 
rather strong shock, while a small supersonic zone ter- 
minated by a very weak shock forms below the lower 
surface. This solution was calculated using both the 
matrix and TVD matrix forms of the numerical dissi- 
pation. Figures 1 and 2 show the airfoil surface pres- 
sure distributions for the calculations using the matrix 
and TVD matrix dissipation, respectively. The solu- 
tions are nearly the same, except for the fact that the 
weak shock on the lower surface of the airfoil is more 
smeared using the TVD form. On the other hand, the 
solution is more highly oscillatory in the vicinity of the 
shock for the solution obtained using the matrix viscos- 
ity. This is visible in the pressure field, but shows more 
clearly in the entropy. Figures 3 and 4 show contours 
of constant entropy for these two cases; the contour 
spacing in these figures corresponds to 0.20 per cent 
loss in total pressure. 
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Fig. 3 Contours of constant entropy; flow past 
NACA 0012 airfoil at free stream Mach number 0.80 
and 1.25 degrees angle of attack; matrix form of dissi- 
pation. 


Similar results are shown for the flow past the NACA 
0012 airfoil at the same angle of attack and a free 
stream Mach number of 0.85 in Figures 5 - 8, and sim- 
ilar conclusions can be drawn. For this case, however, 
both shocks are considerably stronger, and even the 
shock near the airfoil lower surface is captured crisply 
by both schemes. The entropy contours shown in Fig- 
ures 7 and 8, again plotted at levels corresponding to 
0.20 per cent total pressure loss, show considerable os- 
cillation for the matrix dissipation scheme, and are 
smoother for the TVD version, but with more spuri- 
ous entropy generated at the leading edge. 

The spurious entropy generated near the leading edge 
stagnation point can, of course, be reduced by refining 
the mesh. Figures 9 and 10 show the airfoil surface 
pressure distribution and contours of constant entropy- 
are shown for the Mach 0.85 and 1.25 degree angle of 
attack case on a finer grid containing 320 x 64 cells. 
On this grid, both shocks are captured crisply, and 
there is very little spurious entropy generated at the 
leading edge stagnation point even when the TVD form 
of dissipation is used. 

Finally, to illustrate the performance of the TVD ma- 
trix dissipation scheme at higher Mach numbers, a flow 
field having a supersonic free stream is presented. The 
flow past the NACA 0012 airfoil at 2.0 degrees angle 
of attack and a free stream Mach number of 2.0 has 
been computed using the TVD matrix dissipation; nei- 
ther thlToriginal scalar dissipation model nor the ma- 
trix dissipation model converged for this case at the 


.04200 
04000 ■■ 

03600 m 

03600 Hi 

03400 H 
03200 H 

03000 m 
02600 m 
.02600 m 

.02400 m 

° 2soc Hi 

02000 H 
.01600 H 
.01600 
.01400 m 

0120C m 

01000 Hi 
.00600 ■■ 

.00600 m 
,oc4oo m 

00200 Hi 

C.200CE-02 
o.4i2ec-oi 
0. 2000 E- 02 

Fig. 4 Contours of constant entropy; flow past 
NACA 0012 airfoil at free stream Mach number 0.80 
and 1.25 degrees angle of attack; TVD matrix form of 
dissipation. 


values of Courant number normally used. Figure 11 
presents contours of constant pressure coefficient for 
this case; the interval between contour lines plotted 
is A C p = 0.025. The pressure contours clearly show 
the strong bow shock and the weak (supersonic-to- 
supersonic) shocks from the airfoil trailing edge. 

The convergence histories for several of these calcu- 
lations will now be presented. The logarithm of the 
residual (the average of |Ap/At| over all cells in the 
grid) is plotted as a function of work units, where one 
work unit is defined as the amount of computational 
work required for one time step on the finest grid of 
the multigrid sequence. The lift and drag coefficients 
and the number of cells in which the Mach number is 
supersonic are also plotted (on arbitrary scales). These 
three latter quantities are good global measures of the 
convergence of the iteration. Figures 12 - 14 show the 
convergence histories for the three flow fields presented 
above using the TVD matrix dissipation. The conver- 
gence rates are similar for all three cases, but is some 
what slower and more oscillatory for the Mach 0.85 
case. In all three cases the lift and drag coefficients 
and the number of cells in which the local Mach num- 
ber is supersonic have converged to within plottable 
accuracy for these figures in about 100 Work Units. 

While the use of TVD schemes is generally thought to 
be most useful for cases involving very strong shocks, 
they are also valuable for avoiding oscillations in the 
vicinity of very weak shocks. This is illustrated here 
for the flow past a very thin profile at a Mach number 
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Mach 0 8500 Alpha 1.250 

Cl 0.449? Cd 0.0631 Cm -0 1518 

Grid 160x32 Work- 400.61 Res O.382E-06 

Fig. 5 Airfoil surface pressure distribution; NACA 
0012 airfoil at free stream Mach number 0.85 and 1.25 
degrees angle of attack; matrix dissipation. 

very near unity. These flows were of interest in recent 
experiments to verify the transonic similarity law of 
Karman 9 . A profile was generated by maintaining a 
constant value of the transonic similarity parameter 

^ 1 - M 2 

" M 4 / 3 r 2 / 3 ’ 

where M is the free stream Mach number and r is the 
airfoil thickness ratio. Here, a reference case is taken 
to be the symmetric flow past the NACA 0012 profile 
(r re y = 0.12) at a reference Mach number M re j = 0.85. 
For the similar flow at a Mach number of 0.975, Har- 
man’s rule requires a profile having a thickness ratio 
of only 0.00685. Calculations for this flow have been 
performed on grids having 320 x 64 cells using both the 
original scalar dissipation and the TVD matrix model. 
The airfoil surface pressure distributions for these solu- 
tions are plotted in similarity form in Figs. 15 and 16. 
The similarity form of the pressure coefficient plotted 
here is defined as 



The oscillations in pressure coefficient in the vicinity 
of the shock for the original scheme are apparently a 
result of the failure of the shock sensor in the original 
scheme to detect the presence of this extremely weak 
shoclT- note that the minimum value of the (un-scaled) 
pressure coefficient immediately ahead of the shock for 
this case is only about —0.12. The oscillations in the 
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NACA 0012 — TVD Matrix Dissipation 

Mach 0.8500 Alpha 1250 

Cl 0.4352 Cd 0.0621 Cm -0.1467 

Grid 160x32 Work 400.61 Res 0.141E-08 

Fig. 6 Airfoil surface pressure distribution; NACA 
0012 airfoil at free stream Mach number 0.85 and 1.25 
degrees angle of attack; TVD matrix dissipation. 

vicinity of the shock are almost completely absent with 
the improved scheme. 

Finally, it is worth emphasizing that all calculations 
presented here using the TVD matrix dissipation model 
have been performed with the same value of the dissi- 
pation parameter kC). j n a one- dimensional version of 
the scheme written to compute quasi-one-dimensional 
flows in nozzles with shocks, the TVD matrix scheme 
converges well for values of «C) ranging over several 
orders of magnitude, and produces solutions which v 
are virtually independent of the value of this param- 
eter. The two-dimensional implementation does not 
yet perform well for so broad a range of values, but 
the problems seem to be near stagnation points, not 
shock waves, so there seems to be some scope for fur- 
ther improvement of the two-dimensional version of the 
algorithm. 

IV. Concluding Remarks 

Two forms of matrix artificial viscosity have been incor- 
porated into an implicit multigrid algorithm for solving 
a finite-volume approximation to the Euler equations 
of compressible fluid flow. The simplest matrix form 
introduces less spurious total pressure loss, but is prone 
to oscillation, particularly noticeable in the entropy, in 
the vicinity of shock waves. The TVD matrix form cap- 
tures weak shocks less crisply, but is more robust for 
computing flows having shock waves of widely varying 
strengths over a range of free stream Mach numbers. 
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Fig. 7 Contours of constant entropy; flow past 
NACA 0012 airfoil at free stream Mach number 0.85 
and 1.25 degrees angle of attack; matrix form of dissi- 
pation. 


Fig. 8 Contours of constant entropy; flow past 
NACA 0012 airfoil at free stream Mach number 0.85 
and 1.25 degrees angle of attack; TVD matrix form of 
dissipation. 
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NACA 0012 — TVD Matrix Dissipation 
Mach 0.8500 Alpha 1.250 

Cl 0.4521 Cd 0.0633 Cm -0.1532 

Grid 320x64 Work 400.90 Res 0.145E-06 

Fig. 9 Airfoil surface pressure distribution; NACA 
0012 airfoil at free stream Mach number 0.85 and 1.25 
degrees angle of attack; 320 x 64 cell grid using TVD 
matrix dissipation. 


Mach - 2.C000 Pressure contours Maximum - 0.l«71t*0J 

Alpha - 1.2600 Inercnnt - 02300C-OI 

Fig. 11 Contours of constant pressure coefficient; 
flow past NACA 0012 airfoil at free stream Mach num- 
ber 2.00 and 2.00 degrees angle of attack; TVD matrix 
dissipation. 
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Mach 0.800 Alpha 1.250 

Resl 0.268E-01 CFL 8.00 
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Fig. 10 Contours of constant entropy; flow past 
NACA 0012 airfoil at free stream Mach number 0.85 
ancL1.25 degrees angle of attack; 360 x 64 cell grid using 
TVD matrix dissipation. 


Fig. 12 Iteration history for NACA 0012 airfoil 
at 0.80 Mach number and 1.25 degrees angle of at- 
tack; TVD matrix dissipation. Note: force coefficients 
and number of supersonic cells are plotted on arbitrary 
scales. 
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Fig. 13 Iteration history for NACA 0012 airfoil 
at 0.85 Mach number and 1.25 degrees angle of at- 
tack; TVD matrix dissipation. Note: force coefficients 
and number of supersonic cells are plotted on arbitrary 
scales. 


Fig. 15 Scaled pressure distribution for flow past 
similarity-scaled profile; reference case is NACA 0012 
airfoil at 0.85 Mach number; original scalar dissipation. 
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Similarity Scaled NACA 0012 Airfoil 
Mach .9750 Alpha .000 

Cl .0000 Cd .0615 Cm .0000 
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Fig. 14 Iteration history for NACA 0012 airfoil at 
2.0 Mach number and 2.0 degrees angle of attack; TVD 
matrix dissipation. Note: force coefficients and number 
of sup??sonic cells are plotted on arbitrary scales. 


Fig. 16 Scaled pressure distribution for flow past 
similarity-scaled profile; reference case is NACA 0012 
airfoil at 0.85 Mach number; TVD matrix dissipation. 




Block-Multigrid ADI scheme has been developed for three-dimensional prob- 
lems. The scheme uses the horizontal mode of multigrid, and can run on 
a shared-memory parallel computer. Computations of transonic flow past a 
swept wing illustrate the accuracy and efficiency of the scheme. Speed-up re- 
sults are presented to illustrate the ability of the scheme to calculate complex 
flow's in the short turn-around time required in any design application. 


BLOCK IMPLICIT MULTIGRID SOLUTION OF THE EULER 

EQUATIONS 
Yoram Yadlin, Ph.D. 

Cornell University 1990 

A Diagonal Implicit Multigrid (BDIM) scheme has been developed to 
solve the Euler equations of inviscid, compressible flow in three-dimensions, 
and has been implemented within the framework of block-structured grids. 
The work described has been developed along the following path: First a 
multigrid ADI scheme was developed for a single-block grid in three dimen- 
sions, using a diagonalization procedure resulting in a computationally effi- 
cient code; the scheme has been applied to compute transonic flow past a 
swept wing and found accurate and efficient. 

Second, ways to implement the multigrid ADI scheme on block-structured 
grids have been investigated. Two modes of multigrid cycles have been de- 
veloped: one in which the multigrid cycle advances concurrently on all blocks 
(horizontal mode) and one in which the multigrid cycle advances indepen- 
dently in each block (vertical mode). The efficiency and accuracy of both 
modes has been investigated by applying the schemes to compute transonic 
flow past the NACA-0012 airfoil. Both modes have been implemented to run 
on a shared-memory parallel computer and resulting speed-ups have been 
presented and discussed. 

Finally, based on the results of the two-dimensional implementation, the 
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Parallel Computing Strategies for Block Multigrid Implicit 
Solution of the Euler Equations 

Yoram Yadlin* and David A. Caugheyf 
Cornell University, Ithaca, New York 14853 


A multigrid diagonal implicit algorithm has been developed to solve the three-dimensional Euler equations of 
inviscid compressible flow on block-structured grids. An improved method of advancing the multigrid cycle has 
been examined with respect to convergence rates, accuracy, and efficiency. In this method, the multigrid cycle 
is advanced independently in each of the blocks, and the information exchange between the blocks is done using 
buffer arrays, allowing for the asynchronous updating of interface boundary conditions. This updating scheme 
is used to eliminate the convergence problems found in a previous implementation of the algorithm while 
retaining its potential for efficient parallel execution. Results are computed for transonic flows past wings and 
include pressure distributions to verify the accuracy of the scheme and convergence histories to demonstrate the 
efficiency of the method. Efficiencies that were obtained using a modest number of processors in parallel are 
also presented and discussed. 


Introduction 

W HEN one tries to design an algorithm for the simula- 
tion of flow around realistic three-dimensional aerody- 
namic configurations, a major difficulty is the generation of 
an appropriate grid on which to compute the solution. One 
approach to this problem is the use of composite block-struc- 
tured grids. 1 In this approach, the physical domain is divided 
into a set of subdomains; in each subdomain, relatively simple 
grids can be generated. In its most general implementation, 
the subdomains can be adjacent to each other or overlapped 
and have different degrees of continuity at the interfaces. The 
block-structured grids also allow different governing equa- 
tions to be solved on different blocks according to the physical 
characteristics of the flow. The block-structured grid, in a 
natural way, also allows the use of a multiprocessor computer 
since the solution on several blocks can be computed concur- 
rently, resulting in a faster turn-around time. 

Flow solvers that take advantage of block-structured grids 
have been developed in the last few years, employing both 
explicit 3 " 7 and implicit schemes. 8 ' 11 In Ref. 12, an implementa- 
tion of the diagonal implicit multigrid (DIM) algorithm 13 - 14 on 
block-structured grids has been described and implemented 
for two-dimensional problems. The implementation of the 
multigrid scheme has been done in two modes: a horizontal 
mode, in which the multigrid cycle is kept in phase in all the 
blocks; and a vertical mode, in which the multigrid cycle is 
advanced independently in each block. Both modes were ex- 
amined with regard to their accuracy and efficiency in serial 
and parallel execution, and it was found that, although the 
vertical mode shows more potential for high parallel efficiency 
(e.g., less synchronization and overhead), it exhibits conver- 
gence problems. These findings led to the implementation of 
only the horizontal mode in a three-dimensional version of the 
code. 15 

In this paper, an implementation of the vertical mode in a 
three-dimensional code that addresses these convergence prob- 
lems will be described. First the numerical algorithm will be 
described, followed by a description of the implementation of 
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multigrid on block-structured grids. Next, the implementation 
of the code on parallel computers will be discussed, and fi- 
nally, results of calculations of flows past three-dimensional 
geometries will be presented in order to demonstrate the accu- 
racy and efficiency of the algorithm and its applicability for 
parallel execution. 

Description of Algorithm 

The block diagonal implicit multigrid (BDIM) algorithm is 
a direct extension of the DIM algorithm described in Refs. 13 
and 14. The Euler equations are approximated using a cell- 
centered finite volume spatial discretization on the mesh cells 
corresponding to a boundary-conforming curvilinear coordi- 
nate system. Artificial dissipation is added as a blend of sec- 
ond and fourth differences of the solution; the fourth differ- 
ences are necessary to ensure convergence to a steady state, 
and the second difference terms are introduced to prevent 
excessive oscillation of the solution in the vicinity of shock 
waves. The time-linearized implicit operator is approximated 
as the product of three one-dimensional factors, each of which 
is diagonalized by a local similarity transformation, so that 
only a decoupled system of scalar pentadiagonal equations 
needs to be solved along each line. The resulting method has 
good high wave-number damping and thus is a good smooth- 
ing algorithm to be used in conjunction with the multigrid 
method. In the following sections, those aspects relevant to 
the implementation of the DIM algorithm on block-structured 
grids will be described. Most of the descriptions will be for 
problems in two dimensions with references to the appropriate 
extensions to three dimensions. 

Domain Decomposition 

The physical domain is divided into subdomains, which are 
represented by rectangular blocks in the computational do- 
main. Each block has four faces (six in three dimensions), with 
its own coordinate system (£, ri) in the computational space. 
The faces are numbered as illustrated in Fig. 1. 

Each block is defined with two layers of dummy cells 
around it, which are used to enforce the boundary conditions; 
when two faces of neighboring blocks coincide, these layers 
create a region of overlap. The information required at each 
face is as follows: 1) block number, 2) type of boundary 
conditions, and 3) neighboring face number (if applicable) and 
the orientation of its coordinate system. This information is 
stored in a set of two-dimensional integer arrays, created as 
input for the flow solver by the grid generation code. In the 
present implementation, it is required that the grid distribu- 
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tion on coincident faces be the same in both blocks sharing the 
face. 

Boundary Conditions 

Each face can have one of the following types of boundary 
conditions: 1) solid surface, 2) far field, or 3) interface. In the 
present implementation, the type of boundary condition must 
be homogeneous over each block face. For each step of the 
multigrid cycle in which an update to the boundary conditions 
is required, the code loops through the blocks and the faces 
within each block and updates either the solid surface or the 
far-field boundary conditions accordingly. The update of the 
interface boundary conditions is done by introducing a set of 
surface arrays, which act as buffer arrays between the blocks. 
Each block has a surface array that holds a copy of the 
solution vector in the two inner layers; from this array, data 
will be read into the layers of dummy cells of the adjacent 
blocks (see Fig. 2). • 

The treatment of the boundary conditions on solid surfaces 
and in the far field is the same as in the original DIM scheme. 13 
The implicit boundary conditions are treated in a manner 
consistent with the characteristic theory. At each face, the 
appropriate eigenvalues are calculated and used to determine 
the directions of the characteristics for the one-dimensional 
problem normal to the boundary. The boundary conditions 
for those elements of the solution vector that correspond to 
characteristics entering the domain are taken to be homoge- 
neous Dirichlet conditions, whereas the boundary conditions 
on those elements that correspond to characteristics leaving 
the domain are taken to be homogeneous Neumann condi- 



Physical Domain 



Computational Domain 


Fig. 1 Decomposition of physical and computational domains into 
blocks. 



Fig. 2 Surface array. 


tions. Note that there is no extra cost for this implicit charac- 
teristic boundary condition treatment since the calculation of 
the eigenvalues is already required for the construction of the 
coefficient matrix. 

Since the locations of the block boundaries are determined 
independently of the solution, it is possible that large gradients 
(including numerically smeared shocks) may occur in the 
vicinity of the boundaries. This imposes a requirement that no 
approximations be made in the evaluation of the residuals 
near these artificial boundaries. In particular, the treatment of 
the boundary conditions for the dissipation terms is crucial 
since these have the largest difference stencil. In the basic 
algorithm, the dissipative terms include fourth differences of 
the solution, hence each block is surrounded by two layers of 
dummy cells (an increase of 10% in memory requirement for 
a block sized 64 x 64 x 64). 

Data Structure 

The data structure for the composite block-multigrid al- 
gorithm is an extension of the multigrid data structure used 
for a single block. All of the flow variables, coordinates, cell 
areas (volumes in three dimensions), time steps, etc., are 
stored in one-dimensional arrays. The arrays are organized by 
blocks; the unknowns of the first block are stored at the 
beginning of the array, followed by those of the second block, 
and so on. Within each block, the data is organized by grid 
levels, as is the case in the single-block multigrid scheme. The 
surface array data structure follows the same arrangement, 
where at each grid level, data is organized by faces, starting 
with face 1 (£= 1 surface) and ending with face 6 (f=£max 
surface), as illustrated in Fig. 3. 

The position of the first entry for each block is calculated 
according to the number of blocks and grid levels in the 
multigrid cycle and is stored in an integer array. This data 
structure allows access to each block independently and re- 
quired no synchronized input/output when the algorithm is 
executed in parallel. 

The only exceptions are the surface arrays; for these, the 
possibility exists that one block will try to write into a surface 
array while the adjacent block is reading from it. To avoid this 
possibility, a locking mechanism is used in the parallel code, 
which allows access to the array by only one block at a time. 

Multigrid 

The multigrid scheme on a composite block structured grid 
can be implemented in at least two modes: 1) horizontal — the 
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Fig. 3 Data structure. 


multigrid cycle is advanced in phase in all of the blocks, or 2) 
vertical — the multigrid cycle is advanced independently in 
each of the blocks. The main difference between the two 
modes is the degree of interaction between the blocks during 
the multigrid cycle. In the horizontal mode, all of the blocks 
are in phase during the cycle, hence the data exchange between 
the blocks (i.e. , the updating of boundary conditions on inter- 
faces) can be done easily at each grid level in the cycle. On the 
other hand, in the vertical mode, the blocks are synchronized 
only at the beginning/end of each cycle, allowing for data 
exchange only once in the cycle, resulting in the freezing of the 
boundary conditions on interfaces during the entire cycle. As 
discussed in Ref. 12, this updating scheme results in poor 
convergence rates, probably due to the fact that the interface 
boundary conditions on the coarse grids are not the correct 
ones (compared to single block or the horizontal mode multi- 
grid), and the relaxation operations spread these disturbances 
into the interior of the block, thus impairing smoothing. 

One way to improve on this updating scheme is the use of 
asynchronous updating, in which the interface boundaries are 
updated with any available data from adjacent blocks. That 
data can be new or old, depending on the current stage of the 
multigrid process in the adjacent block. The implementation 
of the asynchronous updating in the vertical mode has been 
done by using the surface arrays; any time an update of the 
interface boundaries is required, the most recently available 


data from the surface array of the adjacent block will be read 
into the layers of dummy cells of the block. 

Using these surface arrays, advancing the solution one time 
step involves the following steps: 1) Update interface bound- 
aries by reading in data from the surface arrays of the adjacent 
blocks. 2) Advance the solution one time step. 3) Write out the 
solution in the inner layers into the block surface array. 

A similar sequence of steps is taken in the interpolation step 
of the multigrid cycle. 

The order in which the blocks are updated dictates which of 
the blocks will use surface arrays with updated data and which 
ones will use old data. It is worth noting that the order of 
updating will affect the intermediate solution on the way to a 
converged solution (e.g., a symmetric solution may develop 
asymmetries during the iteration, even though it eventually 
converges to a symmetric solution). The present code allows 
for different combinations of updating interface boundary 
conditions at different stages in the multigrid cycle. The ef- 
fects of these options will be discussed later. 

Implementation on Parallel Computers 

Most physical problems may have some level of inherent 
parallelism. Two common forms of parallelism are spatial and 
functional. In spatial parallelism, each subdomain interacts 
weakly with its neighbors and can be assigned to a different 
processor for updating; information is only occasionally ex- 
changed between the domains. In functional parallelism, the 
physical phenomena can be represented by different model 
equations having different time or length scales; e.g., viscous 
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Fig. 4 Comparison of measured and calculated pressure coefficients: 
= 0.839; a = 3.06 deg. 
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effects taking place in a confined boundary layer or a chemical 
reaction in a multicomponent flow. 

In each of these forms, parallelism can be exploited on 
different levels: 1) fine-grained parallelism at the do-loop 
level, and 2) coarse-grained parallelism at the subroutine level. 
In the case of fine-grained parallelism, each task (a discrete 
section of computational work to be completed) is relatively 
small, requiring more frequent communication, more fre- 
quent synchronization, and greater overhead expenses. For 
coarse-grained parallelism, the tasks are relatively larger, and 
so more computational work is done between synchroniza- 
tions, but more programming effort usually is required to 
implement the parallelism. 

The BDIM algorithm has an inherent spatial parallelism on 
at least two levels. Since it is based on domain decomposition, 
all blocks can be updated concurrently, with exchange of 
boundary data at specific times. In the vertical mode, each 
task can be the execution of an entire multigrid cycle in each 
block, whereas in the horizontal mode, a task might be the 
execution of one part of the multigrid cycle in each block (e.g.. 


Pressure contours 



ONERA WING M6(0 BLOCK.Vert,msurf= 1) 


Minimum = .3821E+00 

a) Maximum = .1492E+01 Upper Surface 
Pressure contours 



ONERA WING M6(8 BLK.Horiz.fbc = - l.UnSyn) 


Minimum = .3821E+00 

b) Maximum = .1492E+01 Upper Surface 

Fig. 5 Plan views of constant pressure contours on upper wing sur- 
faces: a) vertical mode; b) horizontal mode (M a = 0.839; a = 3.06 
deg). 
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Fig. 6 Lines of constant pressure, 12 blocks: M m =0.839; a = 3.06 
deg. 

compute corrections or interpolate corrections to a finer grid). 
This is a coarse-grained level of parallelism, in which the 
number of tasks is of the order of the number of blocks. On 
the fine-grained level, many operations can be done concur- 
rently. For example: the calculation of the residual in each cell 
is independent of the others and can be done concurrently, the 
line sweeps in the alternating-direction implicit (ADI) al- 
gorithm can be done concurrently, the solution of the five 
decoupled pentadiagonal systems on each line can be done 
concurrently, etc. On this fine-grained level, the number of 
tasks can be of the order of the number of cells in the entire 
domain. 

The present implementation has used the parallel extension 
of Fortran (PF) 16 on the IBM 3090-600J. This parallel archi- 
tecture has six processors, each with a vector facility and 
access to a shared memory. The language extension allows for 
fine-grained parallelism by invoking a compiler option that 
generates a parallel code for any do-loop found eligible and 
economical (i.e., when the solution remains the same and the 
code is predicted to be executed faster). The compiler allows 
for the nesting of parallel, scalar, and vector loops within a 
parallel loop and attempts to find the most efficient available 
combination. The coarse-grained level is implemented by exe- 
cuting each step in the multigrid cycle concurrently. This is 
done using the explicit mode of parallelization available in 
parallel Fortran; a set of parallel Fortran tasks is originated 
[the number of tasks is the minimum (number of blocks, 
number of processors)], each having its own local storage and 
subprograms and the ability to share memory with other tasks. 
Each time a do-for-all-blocks loop is encountered, each task 
is dispatched to perform work (i.e., the execution of one 
iteration of the loop). If more than one real processor is 
available, the tasks can be mapped onto different processors 
and the jobs executed in parallel. A wait statement is provided 
for synchronization since all tasks or loop iterations must have 
finished before the next step or loop can be executed. For 
more details on PF and its execution on the IBM 3090 see Ref. 
17. 

Results 

The algorithm just described has been applied to the prob- 
lem of transonic flow past wings. Results have been obtained 
for the transonic flow past an ONERA M6 wing for a variety 
of freestream conditions to verify the accuracy of the al- 
gorithm. Surface pressure distributions will be presented first, 
followed by a comparison of convergence rates for the two 
modes of multigrid, and a discussion of the effects of block 
boundary updates on the solution and convergence rates. Fi- 
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ONERA WING M6(8 BLOCK.Vert,msurf=l ) 
Fig. 7 Convergence history: vertical mode, single grid. 


nally, results from the parallel implementation of the code will 
be discussed. 

The grid used for these calculations is a C-grid containing 
192 x 32 x 32 mesh cells in the wraparound, normal, and 
spanwise directions, respectively. The grid was generated by 
stacking two-dimensional C-type grids in the spanwise direc- 
tion to produce a three-dimensional grid. The two-dimen- 
sional C-type grid was generated by a weak shearing of a 
square root transformation about a point just inside the lead- 
ing edge of the wing surface in each plane of constant z . The 
distribution of mesh cells is such that for each x - y plane 2/3 
of the cells in the wraparound direction are on the wing 
surface and 3/4 of the C-type sections in the spanwise direc- 
tion intersect the wing. The far-field boundaries are located 
approximately 10 chords upstream and downstream of the 
wing, approximately 19 chords laterally from the wing, and at 
approximately 3.5 semispan from the plane of symmetry in the 
z direction. This global grid was artificially divided into eight 
blocks: four blocks containing the wing and its wake and four 
blocks containing the domain outboard of the wing toward the 
spanwise far field (blocks 1-4 around the wing and 5-8 out- 
board of the wingtip). This division resulted in two sets of 
blocks containing 32 x 32 x 24, 64 x 32 x 24, 32 x 32 x 8, 
and 64 x 32 x 8 grid points; the ratio of the number of cells in 
the largest block to that in the smallest block is 6:1. The 
interface boundaries in each spanwise plane lie along the wing- 
normal lines leaving the leading and trailing edges of the 
airfoil and along the cut downstream of the airfoil trailing 
edge. 

The calculations, to be presented here, have been performed 
on an IBM 3090-600J under AIX, for the serial code, and 
under CMS for the parallel code. To confirm the accuracy of 
the vertical mode, results have been calculated for a 
freestream Mach number of 0.839 and 3.06-deg angle of at- 
tack in order to allow comparison with existing wind-tunnel 
test data. Figure 4 presents a comparison with wind-tunnel 
data 18 at four spanwise stations. The Reynolds number based 
on the mean aerodynamic chord Re c for the wind-tunnel test is 
approximately 1 1 .7 x 10 6 . The calculated results predict quite 
accurately the strengths and the locations of the shocks for 
this flow condition in spite of the neglect of viscous effects in 
calculation. 

Figure 5 presents contours of constant pressure on the upper 
surface of the wing for both the horizontal mode and the 
vertical mode. The development of the lambda shock pattern 
on the wing upper surface, characteristic of supercritical flows 
past swept wings, is clearly visible and the solutions are identi- 
cal. 


To examine the effects of an interblock boundary in a 
region of large gradients, the set of blocks containing the wing 
and the wake was further divided into two sets of blocks, 
having a common interface at about the 70% semispan of the 
wing. Contours of constant pressure for this case (12-block 
configuration) are presented in Fig. 6. It is clear that there is 
no visible effect on the shock structure caused by its crossing 
the interblock boundary (indicated by the solid line at the 70% 
semispan). The force coefficients for this case and the previ- 
ous one (eight blocks) are exactly the same. The accuracy of 
the solution in the vicinity of the interface boundaries is a 
direct result of the introduction of the second layer of dummy 
cells, which eliminates the need for approximating the fourth- 
order dissipation terms at the boundaries. This is in contrast to 
the increased thickness of the shock across the interface de- 
scribed by Atkins, 6 which resulted from his approximation of 
the dissipation terms on the interblock boundary. The present 
results also agree exactly with the calculations done using the 
horizontal mode.' 5 

Convergence rates discussed in the following are for the 
ONERA M6 wing at a freestream Mach number of 0.839 and 
3.06-deg angle of attack. The calculations were performed 
using a sawtooth multigrid cycle with grid sequencing, starting 
with the undisturbed flow as the initial guess on the coarsest 
grid. In this strategy, four levels of grids have been used; 100 
multigrid work units have been performed on each level, using 
the interpolated final solution of the coarser grid level as the 
initial solution on the next finer grid. The additional work on 
the first three grids required for this grid sequencing was about 
14% of that on the final grid (for 100 work units on the final 
grid). Figures 7 and 8 present convergence histories on the 
finest grid for the block scheme in vertical mode without and 
with multigrid, respectively. The plotted variables are the log- 
arithm of the average over all of the grid cells of the residual 
of the continuity equation I Ap/At I , the total number of grid 
cells in which the local Mach number is supersonic, and the lift 
and drag coefficients as a function of work units (WU). The 
latter three quantities are plotted on normalized scales, and 
one WU is the amount of computational work required for 
one time step on the fine grid. (One multigrid cycle requires 
slightly less than 8/7 WU for the sawtooth cycle used here). 
The effect of multigrid on the convergence rates is clear; using 
four levels of multigrid, the lift coefficient (as a measure for 
global convergence) reaches its final value in fewer than 40 
WU, whereas without multigrid, more than 150 WU are re- 
quired; with multigrid the error was reduced four orders of 
magnitude in 150 WU, whereas without multigrid a reduction 



ONERA WING M6(12 BLOCK, Vert.msurf = 1 ) 

Fig. 8 Convergence history: vertical mode, four-level multigrid. 
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in error of only two orders of magnitude was achieved with the 
same number of WU. Figure 9 presents the convergence rates 
for the horizontal mode for the same test case. It is clear that 
the two modes have very similar convergence characteristics. 
This is in contrast to our earlier two-dimensional results pre- 
sented in Ref. 12, where the boundary conditions on the block 
interfaces were frozen during the entire multigrid cycle, result- 
ing in degradation of convergence rate for the vertical mode. 

The effect of altering the order in which the blocks are 
solved on the accuracy and stability of the method has been 
examined by creating a random ordering of the blocks in each 
multigrid cycle (in contrast to a fixed consecutive ordering). 
The final results have been found to be exactly the same as for 
the fixed sequence and the differences in convergence rates 
were negligible. 

As mentioned in the description of the multigrid cycle, the 
updating of the interface boundary conditions is done at two 
times during the cycle; before advancing the solution one time 
step and before interpolating the corrections to the next finer 
grid. The effect of the updating has been investigated by 
turning off the updating, when on the coarse grids, before the 
interpolation step and at both steps altogether. It was found 
that when no updating of the boundary conditions is done on 
the coarse grids, the solution diverges (a behavior consistent 
with the two-dimensional results), but when the updating is 
turned off only in the interpolation step, no visible effect on 
convergence can be found. This suggests that, in parallel exe- 
cution, the overhead incurred by the locking mechanism at the 
updating step can be avoided by freezing the boundary condi- 
tions during the interpolation processes. 

We now turn to the results of the parallel computations. 
Since most of the execution time is spent in the multigrid cycle, 
and each block can execute its cycle independently of the other 
blocks, this part of the code has been explicitly executed in 
parallel. This was accomplished using the parallel task and 
parallel lock options in the parallel extension of Fortran on the 
IBM 3090-600J. The same test case (eight block) has been 
executed in parallel, using up to six processors at a time. The 
final solution, after 75 WU, has been found to be exactly the 
same as for the serial execution, independent of the number of 
processors. The convergence rates were almost identical to the 
rates of the serial runs as would be expected based on the 
experiment with the random ordering of the blocks in serial 
runs. 

The parallel performance of the code can be evaluated as 
follows. The maximum theoretical speedup 5 the or expected, 
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Fig. 9 Convergence history: horizontal mode, four-level multigrid. 
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Fig. 10 Parallel speedup for vertical mode. 


ignoring the overhead required to manage the parallel tasks, is 
given by Amdahl’s Law: 


5 '. hri'.T 


1 

l-P + P/N 


where P is the percentage of work performed in parallel, and 
N the number of processors. 

In the case where the overhead V associated with the parallel 
tasking is included, the modified theoretical speedup is given 
by 


1—P + (P/A0(1+ V) 

The actual speedup is given by 

^ serial CPU time 
acl wall clock time 


The efficiency of the parallel execution can be defined as 

„ <^act 

E ~ e 
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Figure 10 presents speedup results for the eight-block con- 
figuration using up to six processors. It has been found that 
the parallel overhead V is almost negligible, except for the 
cases in which six processors have been used, for which 
K = 2.8%. This is similar to the overhead observed for the 
horizontal mode. 15 It is clear that the parallel speedups 
achieved by the code are very good when using up to three 
processors but reduced significantly for any number of proces- 
sors greater than three. This observation can be explained by 
looking at the way in which the different blocks are distributed 
between the processors and the ratio of largest to smallest 
blocks. For example, in the eight-block decomposition, the 
total work to be done between synchronization points consists 
of 24 units (taking the smallest block to be equal to 1 unit). 
When using six processors, the time to process the two largest 
blocks (with a size of six units each) is the limiting factor in 
speedup, i.e., 24/6 = 4; it is similar for five processors. It can 
be seen that the actual speedup achieved for five and six 
processors is, in fact, less than or equal to 4. In Ref. 19, it has 
been shown that improved load-balancing results in better 
speedups for the horizontal mode; the same holds for the 
vertical mode. The reason for lower efficiencies when using a 
large number of processors is most probably due to system 
overhead and limited CPU resources when running in a pro- 
duction environment on a multiuser machine. The results pre- 
sented here have been obtained for the case in which the 
interblock boundary conditions are not updated at the inter- 
polation step. Updating the interblock boundary conditions at 
the interpolation step resulted in no significant difference in 
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parallel performance. This suggests that the amount of time 
spent in the updating process, which causes locking of some 
data and increases overhead, is negligible compared to the 
time spent in advancing the solution. This might change when 
a larger number of blocks is used and the ratio of block 
surface area to block volume increases. 

Conclusions 

A multigrid diagonal implicit algorithm has been developed 
to solve the Euler equations of inviscid compressible flow on 
block-structured grids. An improved version of the vertical 
mode of advancing the multigrid cycle has been examined. In 
this version, the use of buffer arrays allows for asynchronous 
updating of interface boundary conditions on coarse grids, 
thus eliminating the convergence problems encountered when 
the boundary conditions were frozen throughout the cycle. 
Results for transonic flows past wings verify the accuracy of 
the new method, in particular, the fact that no spurious errors 
have been introduced at the interblock boundaries or due to 
the asynchronous updating of the boundary conditions. It is 
also demonstrated that the new version of the vertical mode of 
the multigrid exhibits the same convergence characteristics as 
the horizontal mode, but without the need for frequent syn- 
chronization required in the horizontal mode. The algorithm 
has been implemented on a parallel computer, and speedups 
approaching the theoretical ones have been obtained when 
using a modest number of processors. 
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1 Introduction 

The challenge to develop algorithms to simulate accurately flows over re- 
alistic aerodynamic configurations has led to the exploration of block- 
structured grids. This approach substantially simplifies the task of grid 
generation for complex geometries and provides a natural framework for 
parallel computing. This approach also facilitates the use of different flow 
models in different blocks according to the physical characteristics of the 
flow, and the same applies for efficient grid refinement strategies. 

The division of the flow domain into blocks creates new interface bound- 
aries. The behavior of grid lines at the interfaces is an important character- 
istic that differentiates the methods which fall in this class. The approach 
used by Yadlin and Caughey (1991) to develop a block-structured multigrid 
algorithm for the Euler equations is restricted to interfaces with complete 
continuity. This restriction simplifies the treatment of inter-block bound- 
aries within the flow solver, but reduces the flexibility of the block grid 
generation technique. In order to fully demonstrate the power of block 
grid approaches, the present work is aimed at developing more general 
interface schemes to extend the multi-block Euler solver to more general 
block-structured grids having discontinuities in grid density and spacing. 

The construction of stable, accurate and conservative interface schemes 
has been addressed in numerous papers in recent years. For example, Rai 
(1986) has described a general patched grid interface condition for upwind 
schemes, and Berger and Jameson (1985) have discussed the importance 
of treating the inter-block boundaries conservatively. It is clear that the 
interface treatment depends upon the nature of the inter-block boundaries 
and the numerical algorithm used to solve the governing equations. In the 
present work, a fully conservative interface scheme, which permits the pas- 
sage of discontinuities across block boundaries with minimum distortion of 
the solution, has been developed. The scheme is based upon a cell-centered 
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2 Block-Structured Grids with. Discontinuous Interfaces 

finite volume method with a multigrid implementation of the Alternating 
Direction Implicit (ADI) algorithm. Results demonstrate the feasibility of 
using the multi-block approach with discontinuous grids to solve complex 
flow problems having discontinuous solutions. 

2 Description of Basic Algorithm 

In general curvilinear coordinates the two-dimensional Euler equa- 

tions can be written in strong conservation law form as 



dW 

8F 

dG 0 

(2.1) 
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F - h{pU, pUu+£ x p, pUv+Zyp, (e+p)U} T , 
G = h{pV, pVu+rjxP, pVv + ri y p, (e+p)V} r , 


and p — (7 — 1) {e — p(u 2 + u 2 )/2}. Here, p is the density, u and v are 
the velocity components in the Cartesian coordinates (2,2/), e is the total 
energy per unit volume, p is the pressure, h is the determinant of the Jaco- 
bian of the transformation, and U and V are the contravariant components 
of the velocity. 

The present finite-volume method, following Jameson, Schmidt and 
Turkel (1981), defines the dependent variables p,pu,pv , e and p as cell 
averages. The value of any variable on each cell face is taken to be the 
average of the values in the cells sharing the face. The spatial derivatives 
are approximated by evaluating the net flux across the faces of each mesh 
cell using constant values of the fluxes on each face. 

In order to prevent decoupling of the solution at alternate cells in the 
grid, dissipative terms must be added. These are constructed as an adaptive 
blend of second- and fourth-differences of the solution in each of the mesh 
directions. With the added dissipation, the difference approximation of Eq. 
(2.1) is written as 


+ Qw i:] Dw i:j =0, (2.2) 

where Q is a flux operator, and D is a dissipative operator defined as 

Dwij = SziefSt - + M4 2) £, - (2-3) 

# 9 \ 

where 6( and 6 n are central difference operators. The coefficients e^' and 
are adapted to the solution as described by Caughey (1988). 
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Once discretized, the equations are integrated to the steady state using 
a diagonal ADI method (Caughey 1988). Local time stepping, successive 
grid refinement and the multigrid method are used to accelerate the con- 
vergence. In the present implementation, the multigrid cycle is advanced 
concurrently in all the blocks. For two dimensional problems, a complete 
description of the diagonal implicit multigrid algorithm for a single block 
is given by Caughey (1988), and the extension to the multi-block case on 
smooth grids is given by Yadlin and Caughey (1991). 

3 Interface Scheme 

The physical domain is sub-divided into logically rectangular blocks in 
the computational domain. Each block, with its own coordinate system 
in the computational space, has four faces. Each face can correspond to 
one of several types of boundary conditions: a solid surface, a far field or 
an interface. The treatment of the boundary conditions on solid surfaces 
and in the far field is the same as in Caughey (1988). The ADI algorithm 
requires implicit boundary conditions on all four faces of each block. These 
implicit boundary conditions are treated in a manner consistent with the 
characteristic theory. 

On the inter-block boundaries, special treatment is needed to transfer 
information accurately between the blocks. In the present implementation, 
the grid points on each side of the interface are not necessarily identical, nor 
is the grid spacing across the interface necessarily continuous. In order to 
apply the cell-centered finite volume discretization to the cells on both sides 
of the interface, it is necessary to compute the inviscid fluxes across the cell 
faces lying on the interface. This can be done as follows. First, one of the 
two adjoining blocks is designated as Block 1, and the other as Block 2, (see 
Fig. 1). Then, the dependent variables at the midpoints of cell faces lying 
on the interfaces of Block 1 are interpolated from three cell centered values 
using bilinear interpolation. The three cell centers involved are chosen such 
that the midpoint of the cell face considered lies inside the triangle formed 
by these three points. The corresponding fluxes across the cell faces on the 
interface for Block 1 can then be determined in the standard manner. The 
advantage of this approach is that information is required only from the two 
layers of cells separated by the interface; the computer memory required 
for bookkeeping is thus reduced. In order to preserve global conservation, 
the fluxes through the cell faces on the interface for Block 2 are determined 
by satisfying the conservation condition at the interface, which requires the 
discrete line integral of the numerical flux along both sides of any portion 
of the inter-block boundary be the same. The conserved fluxes for each 
face of Block 2, are thus calculated using the sum of the fluxes through 
each of the segments of Block 1 which lie between the two end points of 



4 Block-Structured Grids with Discontinuous Interfaces 

the cell face of Block 2. For example, in Fig. 1, the flux through cell face 
A of Block 2, which runs from point (j — to point (j + 1), is the sum of 
the fluxes through segments a, b and c of Block 1. 

The easiest way to keep the dissipative fluxes conservative is to set the 
first- and third-difference dissipative fluxes to zero at interfaces. If there 
is no shock passing through the interface, the solution is almost unaffected 
by this approach. If there is strong shock crossing the interface, setting the 
first-difference fluxes equal to zero will result in an inaccurate or divergent 
solution. In the present implementation, the third-difference fluxes at the 
interfaces are set to zero, but the first-difference fluxes are treated the same 
as the inviscid fluxes at the interfaces. The required first difference at cell 
faces on the interface of Block 1 can be formed using linear combinations 
of the values at the last cell centers and at the midpoints of cell faces on 
the interface. For Block 2, the quantities are calculated in exactly the same 
way as are the inviscid fluxes. In this way, both inviscid and dissipative 
fluxes are kept conservative across the interface. 


4 Results 

The algorithm described above has been applied to compute transonic flows 
past the NACA 0012 airfoil. Results have been obtained for several types of 
grids and free stream conditions to verify the accuracy and functionality of 
the method. The convergence histories of the multi-block implementation 
are virtually identical to those for corresponding single block grids in the 
several cases for which comparisons have been made. 

The first test case presented here is designed to verify the accuracy of 
the interface scheme on discontinuous grids, and to examine the effects of a 
discontinuous block boundary in the vicinity of a large gradient. The free 
stream Mach number is 0.875 with 0° angle of attack. The grid around the 
airfoil is patched together using six blocks, as shown in Fig. 2. The grid dis- 
tribution is clearly discontinuous across the interface boundaries. Contours 
of constant pressure for this case are presented in Fig. 3. No discontinuities 
in the slopes of the contours are observed, and the discontinuous inter-block 
boundary has no visible effect on the shock structure. 

In the second test case, the present blocked grid approach is used to 
allow efficient grid refinement. Flow regions requiring higher resolution 
can be isolated in separate blocks and the required grid refinement can be 
introduced in these blocks. This test case has a free stream Mach number of 
0.8 with 1.25° angle of attack. The flow field is divided into 18 blocks with 
refined blocks near the leading edge and in the regions where shocks are 
expected to appear in the solution, as shown in Fig. 4. The results verify 
that the present method can generate solutions of accuracy comparable to 
those obtained on a single block, uniformly fine grid (384 x 64), by using 
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locally refined grid blocks. The contour plots of pressure shown in Fig. 5 
demonstrate the continuity of the solution across block boundaries. 


5 Conclusion 

A multi-block approach which permits grid discontinuities at inter-block 
boundaries has been developed for the two dimensional Euler equations. 
The interface schemes are designed for the cell-centered finite volume ADI 
method, but could be applied to explicit or other implicit methods as well. 
Results have demonstrated the accuracy and functionality of the method. 
The method is ready to be extended to three dimensional problems. 
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Two implicit multigrid algorithms for the two and three dimensional com- 
pressible Euler equations have been developed in this dissertation. 

First, a diagonal implicit multigrid method is developed for solving a 
finite- volume approximation to the Euler equations in which the dependent 
variables are stored at the cell vertices. The spatial derivatives in the two di- 
mensional Euler equations are approximated using a conservative cell- vertex 
finite volume formulation. Artificial dissipation is provided by adding an 
adaptive blend of second and fourth differences of the solution to maintain 
stability and accuracy. A Diagonal Alternating Directional Implicit method 
is used to advance the solution in time. Rapid convergence to a steady-state 
solution is achieved with local time stepping and the multigrid algorithm. 
Results for the transonic flow past the NACA 0012 airfoil are presented to 
verify the accuracy and efficiency of the scheme. 

Second, the development of an efficient and flexible multi block/multigrid 
Euler solver and its application to realistic engineering problems are pre- 
sented. A cell-centered finite volume method with a multigrid implementa- 
tion of the Diagonal Alternating Direction Implicit algorithm is used to solve 


the Euler equations. A fully conservative inter-block boundary condition, 
which permits the passage of discontinuities across block boundaries with 
minimum distortion of the solution, is developed for cases in which the grid 
lines at the inter-block boundaries can be completely continuous or discontin- 
uous. Information is exchanged between blocks by using surface arrays, which 
contain all the data needed to update the inter-block boundary conditions. 
Results demonstrate the feasibility of using the present multi-block/multigrid 
approach to solve flow problems involving complex geometries. Two dimen- 
sional results for several types of grids and various free stream conditions 
have been presented to verify the accuracy and computational efficiency of 
the method. The application of the multiblock approach as a means to per- 
form efficient grid refinement has also been demonstrated. Calculated results 
for three dimensional external and internal flow fields surrounding a highly 
contoured super-elliptic diffuser inlet are presented to demonstrate the accu- 
racy and functionality of the present multiblock/multigrid methodology. 
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Abstract 

The development of an efficient and flexible 
multiblock/multigrid Euler solver and its applica- 
tion to realistic engineering problems are described 
in the present paper. A cell-centered finite volume 
method with a multigrid implementation of the 
Alternating Direction Implicit (ADI) algorithm is 
used to solve the Euler equations. A fully conserva- 
tive inter-block boundary condition, which permits 
the passage of discontinuities across block bound- 
aries with minimum distortion of the solution, is 
developed for cases in which the grid lines at the 
inter-block boundaries can be completely contin- 
uous or discontinuous. Information is exchanged 
between blocks by using surface arrays, which con- 
tain all the data needed to update the inter-block 
boundary conditions. Results demonstrate the fea- 
sibility of using the present multi-block/multigrid 
approach to solve flow problems involving com- 
plex geometries. Two dimensional results for sev- 
eral types of grids and free stream conditions have 
been presented to verify the accuracy and compu- 
tational efficiency of the present method. The ap- 
plication of the present multiblock approach as a 
means to perform efficient grid refinement has also 
been demonstrated. Calculated results for three 
dimensional external and internal flow fields sur- 
rounding a highly contoured super-elliptic diffuser 
inlet are presented to demonstrate the accuracy 
and functionality of the present method. 

I. Introduction 

Considerable progress has been made over the last 
two decades in developing computational fluid dy- 
namics (CFD) methods for aerodynamic applica- 
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tions. Recent work in CFD has concentrated pri- 
marily on developing algorithms for the solution 
of the Euler and Navier-Stokes Equations and on 
applying these algorithms to increasingly complex 
aeronautical configurations. A major obstacle in 
solving flow problems over complex geometries is 
the generation of smoothly varying meshes about 
such configurations. In order to overcome this ob- 
stacle, two basic approaches have been proposed. 
The first is to employ completely unstructured 
meshes as in the methods described by Jameson 
et al [1], Billey et al [2], and Peraire et al [3]. The 
meshes are composed of triangles in 2D and tetra- 
hedrons in 3D connecting a random distribution of 
points. The second approach involves the use of 
multiblock structured grids as in the methods pre- 
sented by Thompson [4] , Weatherill and Forsey [5], 
Thomas [6], and Karman et al [7]. This approach 
divides the given flow domain into sub-domains, 
in each of which can be generated independently 
a simple structured mesh of quadrilateral cells in 
2D or hexahedral cells in 3D. Both these methods 
have produced impressive demonstrations of their 
capabilities. 

The aim of our present research is to develop 
an efficient algorithm to simulate two and three 
dimensional transonic flows around complex ge- 
ometries. To this end, a multiblock approach is 
adopted here. This approach can substantially 
simplify the task of grid generation for complex 
geometries and also provides a natural framework 
for parallel computing. This approach can also fa- 
cilitate the use of different flow models in different 
blocks according to the physical characteristics of 
the flow, and the same can apply for efficient grid 
refinement strategies. 

The division of the flow domain into blocks cre- 
ates new inter-block boundaries between contigu- 



ous blocks. The behavior of grid lines at the 
inter-block boundaries is an important character- 
istic that differentiates the methods which fall in 
this class. Grid lines from the two adjoining re- 
gions may meet with complete continuity, with or 
without slope continuity, or may not even meet. 
A more general type of interface uses overlapping 
grids as in the Chimera method of Benek et al [8] . 
While the overlapping grid approach provides even 
greater flexibility in mesh generation, it is more dif- 
ficult to communicate between the blocks and to 
maintain global conservation. Thompson [9] has 
provided a survey of various grid concepts, gener- 
ation methods and related issues. 

Various Euler solvers have been developed us- 
ing multiblock approaches, and the construction of 
stable, accurate and conservative interface schemes 
has been addressed in numerous papers in recent 
years. For example, Hessenius and Pulliam [10] 
employed a conservative upwind flux vector split- 
ting scheme at interior inter-block boundaries for 
the Beam- Warming [11] implicit integration pro- 
cedure. Rai [12] has described a general patched 
grid interface condition for upwind schemes, and 
Berger and Jameson [13] have discussed the impor- 
tance of treating the inter-block boundaries con- 
servatively. In general, these methods differ from 
one another in several aspects, such as upwind 
vs. central differencing procedure, explicit vs. im- 
plicit integration process, cell-centered vs. cell- 
vertex scheme, conservative vs. non-conservative 
formulation, convergence acceleration technique, 
patched grid vs. overlapping grid approach. 


It is clear that the interface treatment depends 
upon the nature of the inter-block boundaries and 
the numerical algorithm used to solve the govern- 
ing equations. Wang and Caughey [15] developed 
a fully conservative interface scheme, which per- 
mits the passage of discontinuities across block 
boundaries with minimum distortion of the solu- 
tion. The scheme was based upon a cell-centered 
finite volume method with a multigrid implemen- 
tation of the Alternating Direction Implicit (ADI) 
algorithm. In the present work, the fully con- 
servative interface scheme is extended to three- 
dimensional problems. Three significant features 
of the present work are the following: (1) The gen- 
eral specification of the boundary conditions at the 
block boundaries combined with a fully conserva- 
tive interface scheme greatly enhances the power 
of the multiblock approach; (2) Efficient local grid 
refinement can be performed by using the present 
multiblock approach; and (3) Only a relatively lim- 
ited modification of the input data is needed when 
switching from one application to another. 

The present multiblock/multigrid method has 
been applied to compute two dimensional and 
three dimensional flows. Results of two dimen- 
sional transonic flows past the NACA 0012 airfoil 
have been obtained for different types of grids and 
free stream conditions. The three dimensional ex- 
ternal and internal flow through an engine inlet 
is also presented here. Results demonstrate the 
ability of the present multiblock methodology to 
compute flows about complex geometric configu- 
rations. 


The present work is an extension of a block- 
structured multigrid algorithm for the Euler equa- 
tions developed by Yadlin and Caughey [14]. In 
their work, the inter-block boundaries were re- 
stricted to those having complete continuity, which 
means that continuity conditions are imposed on 
grid points, grid spacing and grid line orientation 
at grid interfaces between adjoining blocks. This 
restriction simplifies the treatment of inter-block 
boundaries within the flow solver since it implies 
that cells at an inter-block boundary can be treated 
as interior grid cells, but greatly reduces the flex- 
ibility of the block grid generation technique. In 
order to fully realize the power of the multiblock 
approach, the present work is aimed at developing 
more general interface schemes to extend the multi- 
block Euler solver to more general block-structured 
grids, even those having discontinuities in grid den- 
sity and spacing. 


II. Description of Basic Algorithm 

The unsteady, three dimensional Euler equations 
can be written in conservation law form as 


dw df dg dh 

dt ^dx^dy dz ' 
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are the flux vectors in the x, y and z coordinates 
respectively. The variable p is the density, u, v 
and w are the velocity components in the x, y and 
z coordinate directions, respectively, e is the total 
energy per unit volume, and p is the pressure. For 
a perfect gas, the pressure is related to the total 
energy by the equation of state 
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where 7 is the ratio of specific heats and Ho is the 
total enthalpy. 


To allow the treatment of arbitrary geometries, 
Eq. 1 is transformed into the general curvilinear 
coordinates (£,* 7,0 and written in strong conser- 
vation law form as 
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are the transformed flux vectors. Here, J is the 
determinant of the Jacobian of the transformation 
(which corresponds to the cell volume) 
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and U, V, and W are the contravariant components 
of the velocity given by 
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The present finite-volume method, following 
Jameson, Schmidt and Turkel [16], defines the de- 
pendent variables p, pu, pv, pw, e and p as cell av- 
erages. The value of any variable on each cell face 
is taken to be the average of the values in the cells 
sharing the face. The spatial derivatives are ap- 
proximated by evaluating the net flux across the 
faces of each mesh cell using constant values of the 
fluxes on each face. 

In order to prevent decoupling of the solution at 
alternate cells in the grid, dissipative terms must 
be added. These are constructed as an adaptive 
blend of second- and fourth-differences of the so- 
lution in each of the mesh directions. With the 
added dissipation, the difference approximation of 
Eq. 3 is written as written as 


where 
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is the vector of transformed dependent variables, 
and 
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where Q is a flux operator, and D is a dissipative 
operator defined as 

+ M 4 a) *» ( 6 ) 

+ m4 2) *c 

where 6^ , 6 n and 6( are central difference operators. 
The coefficients and are adapted to the 
solution as described by Caughey [17]. 

Once discretized, the equations are integrated to 
the steady state using the diagonal ADI method. 


3 



Local time stepping, successive grid refinement and 
the multigrid method are used to accelerate the 
convergence. For two dimensional calculations, a 
complete description of the diagonal implicit multi- 
grid algorithm is given by Caughey [17], and the 
extension to the three dimensional flow is given by 
Yadlin and Caughey [18]. 

III. Multiblock/Multigrid Implementation 

The multiblock approach adopted here consists 
of a preliminary topological subdivision of the com- 
plete physical flow domain into a number of logi- 
cally rectangular blocks in the computational do- 
main. The present method assumes that the grids 
do not overlap, but share common boundaries. The 
choice of non-overlapping grids has an advantage 
with respect to communication between the blocks. 
In this case, only 2D data structures are required 
for the inter-block boundaries. This simplifies the 
logic and allows increased of efficiency. Although 
the inter-block surface must be common between 
the two neighboring blocks, there is no restriction 
on the grid slope or density across the block bound- 
ary. This allows highly complex geometries to be 
broken into a collection of simple blocks, each of 
which requires only simple grid generation tech- 
niques. 

Each block, with its own coordinate system in 
the computational space, has four faces in 2D or six 
faces in 3D. In the present implementation, it is re- 
quired that at each face, only one type of boundary 
condition can be applied. This often requires intro- 
ducing more computational blocks than necessary, 
but the handling of the boundary conditions is sim- 
plified. The information required at each face is as 
follows: (1) the block number; (2) the neighbor- 
ing block number and face number (if applicable); 
(3) the orientation of the face coordinate system; 
and (4) the boundary condition type. This infor- 
mation is stored in a set of 2-D integer arrays and 
used in the flow solver to steer the boundary condi- 
tion routines and link the interfaces to each other. 
These integer values can either be created as in- 
put by the grid generation code or specified by the 
user to select appropriate boundary conditions for 
a particular block structure. 

The multiblock approach is combined with 
multigrid in order to accelerate iterative conver- 
gence. Two options to implement the multigrid 
within the multiblock framework exist. The loop 
over the various blocks may be put inside or out- 
side the loop over the different grid levels. These 


are called horizontal mode and vertical mode, re- 
spectively. In the horizontal mode, the multigrid 
cycle is kept in phase in all the blocks; in the verti- 
cal mode, the multigrid cycle is advanced indepen- 
dently in each block. A discussion of the advan- 
tages and drawbacks of these two modes was given 
by Yadlin and Caughey [14,19]. In the present im- 
plementation , the multigrid cycle is advanced using 
the horizontal mode in the 2D case and the vertical 
mode in the 3D case. 

Boundary Conditions 

The aim of this work is to develop an efficient 
method for calculating flows over complex geome- 
tries without the need to modify the computer pro- 
gram for each new geometry. In the present imple- 
mentation, general specification of boundary con- 
ditions on the four/six computational faces is al- 
lowed for calculating either external flow around 
an airfoil/wing or internal flow through a cas- 
cade/inlet. Any of several different types of bound- 
ary conditions can be applied on each boundary 
face: solid surface, symmetry plane, far field of 
external flow, inflow/outflow of interned flow, peri- 
odic boundary or interface boundary. 

At a solid surface, the mass flux is zero. Only the 
pressure on such a face contributes to the momen- 
tum flux balance. For inviscid flow, a symmetry 
condition is equivalent to a solid surface boundary 
condition. 

For external flow in the far field, the Riemann in- 
variants for the one dimensional flow normal to the 
boundary are used for inflow and outflow bound- 
aries. At an outflow boundary point, the tangen- 
tial velocity component and entropy are extrapo- 
lated from the interior, while at an inflow boundary 
point they are specified at their free stream values. 
The detailed treatment of explicit boundary con- 
ditions on solid surfaces and in the far field for 
external flow are described by Caughey [17]. 

For internal flow, the inflow boundary condition 
is the same as the inflow condition for external 
flow. On an outflow boundary, only the pressure 
is specified, while entropy and the three velocity 
components are extrapolated from the interior of 
the flow field. For periodic boundaries, the treat- 
ment is the same as an interface boundary, which 
will be discussed later. 

The ADI algorithm requires implicit boundary 
conditions on all four/six faces of each block. 



These implicit boundary conditions are treated in 
a manner consistent with the characteristic the- 
ory. At each face, the appropriate eigenvalues are 
calculated and used to determine the directions of 
the characteristics for the one-dimensional problem 
normal to the boundary. The boundary conditions 
for the corrections to those elements of the solution 
vector which correspond to characteristics entering 
the domain are taken to be homogeneous Dirichlet 
conditions, while the boundary conditions on the 
corrections to those elements which correspond to 
characteristics leaving the domain are taken to be 
homogeneous Neumann conditions. 

Interface Scheme 

In the original code developed by Yadlin and 
Caughey [14], the grid lines at the interfaces are 
continuous across block boundaries. In order to 
treat boundary grid points as interior grid points, 
two extra layers of dummy cells are added outside 
each block. In the basic algorithm, the dissipa- 
tive terms include fourth differences of the solu- 
tion, hence their direct evaluation would require 
two extra layers of dummy cells. In the present 
implementation, the grid points on each side of 
the interface are not necessarily identical, nor is 
the grid spacing across the interface necessarily 
continuous. For this general type of inter-block 
boundaries, special treatment is needed to transfer 
information accurately between the blocks. 

In order to apply the cell-centered finite volume 
discretization to the cells on both sides of the inter- 
face, it is necessary to compute the inviscid fluxes 
across the cell faces lying on the interface. This can 
be done as follows. First, one of the two adjoin- 
ing blocks is designated as Block 1, and the other 
as Block 2. Fig. 1 shows a schematic inter-block 
boundary for the 2D case. The dependent vari- 
ables at the midpoints of the cell faces lying on the 
interfaces of Block 1 are interpolated from the cor- 
responding cell centered values using bilinear inter- 
polation. The cell centers involved are chosen such 
that the midpoint of the cell face considered lies 
inside the triangle for 2D formed by three points 
or the tetrahedron for 3D formed by four points. 
The corresponding fluxes across the cell faces on 
the interface for Block 1 can then be determined 
in the standard manner. The advantage of this ap- 
proach is that information is required only from the 
two layers of cells separated by the interface; the 
computer memory required for bookkeeping is thus 
reduced. In order to preserve global conservation, 


the fluxes through the cell faces on the interface for 
Block 2 are determined by satisfying the conserva- 
tion condition at the interface, which requires the 
discrete line/surface integral of the numerical flux 
along both sides of any portion of the inter-block 
boundary be the same. The conserved fluxes for 
each face of Block 2 are thus determined using the 
sum of the contributions from each of the segments 
of Block 1 which lie inside the cell face of Block 2. 
For example, in Fig. 1, the flux through cell face A 
of Block 2, which runs from point (j — to point 
(i+ |), is the sum of the fluxes through segments 
a, b and c of Block 1. 

By analogy with the treatment of inviscid fluxes 
at the interfaces, special interface formulas for the 
dissipative fluxes also must be provided. The eas- 
iest way to keep the dissipative fluxes conserva- 
tive is to set the first- and third-difference dissi- 
pative fluxes to zero at interfaces. If there is no 
shock passing through the interface, the solution 
is almost unaffected by this approach. If there is 
a strong shock crossing the interface, setting the 
first-difference fluxes equal to zero will result in 
an inaccurate or even a divergent solution. In the 
present implementation, the third-difference fluxes 
at the interfaces are set to zero, but the first- 
difference fluxes are treated in a manner similar 
to the inviscid fluxes at the interfaces. The re- 
quired first difference at cell faces on the interface 
of Block 1 can be formed using linear combina- 
tions of the values at the last cell centers and at 
the midpoints of the cell faces on the interface. For 
Block 2, the quantities are calculated in exactly the 
same way as are the inviscid fluxes. In this way, 
both inviscid and dissipative fluxes are kept con- 
servative across the interface. 

It is noted that since an approximation is made 
in the evaluation of the dissipative terms by this 
interface scheme, the solution at points near block 
boundaries is formally less accurate than at other 
interior points. The present code therefore allows 
either of two types of interface boundary conditions 
to be applied at the block surface: (1) a completely 
continuous interface; or (2) a general or discontin- 
uous interface, in which the solution at one bound- 
ary face is calculated by interpolation, while that 
at the pairwise one is calculated by integration. It 
is, of course, clear that the general interface condi- 
tion also applies for grids with complete continuity. 

The inviscid fluxes and dissipative fluxes cal- 
culated as described above at the interfaces for 
both adjoining blocks are stored in special sur- 



face arrays. Also stored in these surface arrays 
is the required geometric information at the inter- 
face, which includes: the pointers addressing the 
cell center points needed to interpolate the depen- 
dent variables at the midpoints of the cell faces of 
Block 1 and the interpolation coefficients needed to 
calculate the face flux for Block 2. These pointer 
and coefficient arrays can be set up at the mesh 
generation stage. The data structure of the sur- 
face arrays follows the usual multiblock/multigrid 
framework. These arrays are treated as one dimen- 
sional arrays, and organized by blocks. The data 
of the first block are stored at the beginning of the 
array, followed by those of the second block, and 
so on. Within each block, the data is organized by 
grid level. At each grid level, the data is organized 
by faces. The data stored in the surface arrays is 
accessed using a system of pointers, which are cal- 
culated according to the number of blocks and grid 
levels in the multigrid cycle and are stored in an 
integer array. 

IV. Results 

The algorithm described above has been applied 
to compute two dimensional and three dimensional 
flows. Results of two dimensional transonic flows 
past the NACA 0012 airfoil have been obtained 
for different types of grids and free stream condi- 
tions to verify the accuracy and functionality of the 
present method. The three dimensional external 
and internal flow through an engine inlet is also 
presented here to demonstrate the ability of the 
present multiblock methodology to compute flow 
about complex geometric configurations. 

2D Results 

The first two dimensional case is presented to 
compare the convergence rates of the multi-block 
implementation with those for corresponding sin- 
gle block grids. For this purpose, it is necessary to 
choose a case for which the flow can be calculated 
equally well using a single block method. The cal- 
culation is done for the flow over the NACA 0012 
airfoil at a free stream Mach number of 0.80 and 
1.25° angle of attack. The grid used in the cal- 
culation is a C-grid, containing 192 x 32 grid cells 
in the wraparound and normal directions, respec- 
tively. To test the multiblock calculations, the grid 
was artificially divided into three blocks, contain- 
ing 32 x32, 128 x 32 and 32 x32 grid cells, as shown 
in Fig. 2. It is obvious that the grid lines at the 
interfaces between Block 1 and Block 2 or between 
Block 2 and Block 3 are completely continuous. In 


order to test the present version of the multiblock 
code, the general type of the interface scheme is 
used in this calculation, instead of using the con- 
tinuous interface scheme (results of which were dis- 
cussed by Yadlin and Caughey [14]). The calcula- 
tions were performed using a saw-tooth multigrid 
cycle with grid sequencing, starting with the undis- 
turbed flow as the initial guess on the coarsest grid. 
In this calculation, 100 multigrid work units were 
first performed on grids containing 8 x 8, 32 x 8, 
and 8x8 cells in the three blocks, respectively, 
using three levels of multigrid. The solution was 
then interpolated onto grids containing 16 x 16, 
64 x 16, and 16 x 16 cells, and an additional 100 
work units were performed. Finally, this solution 
was interpolated into the fine grids and a final 200 
work units, using five levels of multigrid, were per- 
formed. Fig. 3 shows the convergence history on 
the finest grid for the present multiblock method 
in horizontal mode with five levels of multigrid. 
Four measures of convergence are plotted: the log- 
arithm of the average over all the grid cells of the 
residual of the continuity equation |Ap/At|, the 
total number of grid cells in which the local Mach 
number is supersonic, the lift coefficient Cj and 
the drag coefficient Cj as a function of computa- 
tional labor, measured in work units. The latter 
three quantities are plotted on normalized scales, 
and one work unit is the amount of computational 
work required for one time step on the fine grid. 
For reference purposes, a calculation has also been 
done using a single-block grid that is, in all other 
respects, equivalent to the multiblock method de- 
scribed above. Fig. 4 presents the convergence his- 
tory for the single block calculation. Comparing 
Figs. 3 and 4 shows that the convergence history 
of the multiblock implementation is virtually iden- 
tical to that for the corresponding single block grid. 
There is no significant degradation in the perfor- 
mance of the multiblock/multigrid method, even 
though the interface scheme for updating the solu- 
tion at points on block boundaries is formally less 
accurate than at other interior points. 

The second test case presented here is artificially 
designed to verify the accuracy of the interface 
scheme on discontinuous grids, and to examine the 
effects of discontinuous inter-block boundaries in 
the vicinity of a large gradient. The free stream 
Mach number is 0.875 with 0° angle of attack. The 
grid around the airfoil is patched together using six 
blocks, as shown in Fig. 5. Block 2, next to the air- 
foil, contains 144 x 16 grid cells. The grids in the 
other five blocks are coarser than those in Block 2, 



containing 16 x 12, 16 x 12, 16 x 12, 96 x 12, and 
16 x 12 cells, respectively. The grid lines across the 
interfaces between Block 1 and Block 4, Block 3 
and Block 6, Block 4 and Block 5, and Block 5 and 
Block 6, are completely continuous; while the grid 
distributions are clearly discontinuous across the 
interface boundaries between Block 1 and Block 2, 
Block 2 and Block 3, and Block 2 and Block 5. 
For the continuous inter-block boundary, the first 
type of interface scheme has been used and for the 
discontinuous inter-block boundaries, the general 
type of interface scheme has been used. Contours 
of constant pressure as well as the block boundaries 
for this case are presented in Fig. 6. Although there 
is a strong shock generated near the rear part of 
the airfoil, no discontinuities in the slopes of the 
pressure contours are observed, and the discontin- 
uous inter-block boundaries have no visible effect 
on the shock structure. 

In the third test case, the present multiblock ap- 
proach is used as a tool to perform efficient grid re- 
finement. Flow regions requiring higher resolution 
can be isolated in separate blocks and the required 
grid refinement can be introduced in these blocks. 
This test case has a free stream Mach number of 
0.8 with 1.25° angle of attack. The coarse global 
grid is a C-grid with 192 x 32 grid cells. Then, 
the flow field is divided into 18 blocks with refined 
blocks near the leading edge and in the regions 
where shocks are expected to appear in the solu- 
tion, as shown in Fig. 7. The block sizes vary from 
4 x 16 cells to 32 x 32 cells, with a total of 8,448 
grid cells. The contour plots of pressure shown in 
Fig. 8 demonstrate the continuity of the solution 
across block boundaries. Fig. 9 shows the surface 
pressure coefficient distributions calculated by the 
multiblock method on the 18-block grid. The lift 
coefficient is 0.3638, the drag coefficient is 0.0239. 
Fig. 10 shows the surface pressure coefficient distri- 
butions calculated by a single block method using 
a global grid with 384 x 64 grid cells. The lift coeffi- 
cient in this case is 0.3644, and the drag coefficient 
is 0.0234. It is evident that the present method, 
by using locally refined grid blocks, can generate 
solutions that agree well those obtained on a single 
block, globally refined grid, and the interfaces do 
not cause any problems even when they are crossed 
by shocks. However, a global refinement increases 
the total number of grid cells by a factor of 4, while 
the local refinement increases the number of cells 
by a factor of only 1.375. This fact illustrates the 
potential of local refinement and the power of the 
present multiblock methodology. 


3D Results 

The present multiblock/multigrid approach was 
applied to simulate the integrated internal and ex- 
ternal flow fields surrounding a highly contoured 
super-elliptic diffuser inlet. This geometry is a re- 
alistic engineering geometry and it is complicated 
enough to verify the functionality of the multiblock 
method. Chyu and Bencze [20] used a multiblock 
approach combined with a two-topology grid to 
represent both the internal and external flow fields 
surrounding this geometry, and used an implicit, 
approximately factored, partially flux-split finite- 
difference algorithm to solve the three dimensional 
thin-layer Navier-Stokes equations. Although good 
agreement was shown between the experiment and 
their computation, the convergence rate of their 
computation was very slow. The present calcula- 
tion solves only the Euler Equations; an effort to 
extend the present method to the Navier-Stokes 
equations is underway. 

The inlet geometry consists of a bell-mouth en- 
trance, a straight rectangular duct, and an offset 
diffuser with cross sectional profiles that vary from 
rectangular to circular, and a circular exit duct, as 
shown in Fig. 11. To treat the complexity of this 
geometry, a composite grid of three blocks with 
two types of grid topology was generated by Chyu 
and Bencze [20]. The three grid blocks consisted of 
an external grid which extended from the highlight 
of the bell-mouth to the far field boundary of the 
external flow and two internal grids referred to as 
the outer internal grid and the inner internal grid. 
An O-H topology was used for the external and 
outer internal grids, whereas to avoid the numeri- 
cal singularity associated with an O-grid along the 
centerline of the duct, an H-H topology was used 
for the inner internal grid. The internal grids are 
shown in Fig. 12 for typical rectangular, elliptical 
and circular sections of the duct. The grids (repre- 
senting only half of the geometry due to symmetry) 
consist of a total of 51, 200 cells including 48 x 16x8 
cells in the inner internal H-grid, 48 x 32 x 8 cells 
in the outer internal O-grid and 32 x 32 x 32 cells 
in the external O-grid. The grid points were clus- 
tered near the bell-mouth and in the offset region 
of the inlet where the flow variation is great in the 
axial direction. 

In the present implementation, the type of 
boundary condition must be the same for the entire 
extent of each face of a block, so it was necessary 
to decompose the original three blocks into four- 
teen blocks. The block sizes vary from 16 x 32 x 32 



to 128 x 64 x 32. Although the grid lines across 
the interface boundary are continuous, their slopes 
and coordinate orientations are not continuous at 
all interfaces. 

The calculations are done for a free stream Mach 
number of 0.5 and for two different exit static pres- 
sure ratios. Three successive grids have been used. 
Figs. 13 and 14 present convergence histories on 
the finest grid for the block scheme with three lev- 
els of multigrid, for pressure ratios of 1.11 and 1.08, 
respectively. The effect of the multigrid on the con- 
vergence rate is very clear. 

Figs. 15 and 16 show the computed Mach con- 
tours of the flow field in the symmetry plane of 
the inlet, and Figs. 17 and 18 show the computed 
Mach contours in the rectangular, elliptical and 
circular sections of the duct. The contour lines are 
smooth and continuous across the interface bound- 
aries. Since these results are obtained using the 
Euler equations, there is no result available for 
comparison. Nevertheless, the results demonstrate 
that the method can be used to calculate three di- 
mensional flows over complex geometries, given a 
grid of multiblock type. 

V. Conclusion 

An efficient and flexible multiblock/multigrid al- 
gorithm to solver the Euler equations on structured 
grids with non-overlapping inter-block boundaries 
is developed and validated by calculating sev- 
eral two and three dimensional flow problems. 
The fully conservative treatment of the inter-block 
boundary allows the passage of discontinuities 
across block boundaries with minimum distortion 
of the solution. Results have demonstrated the ac- 
curacy and functionality of the present multiblock 
method. The method is ready to be extended to 
solve the Navier-Stokes equations. 
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Figure 3: Convergence history; three-block grid; Figure 5: Block-structured grid with discontinuous 
NACA 0012 airfoil at = 0.8, a = 1.25° inter-block boundaries 
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Figure 4: Convergence history; single grid; NACA Figure 6: Constant pressure contours: NACA 0012 
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Figure 15: Constant Mach contours at symmetry 
plane; at = O^p^/p^ = 1.11 
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Figure 14: Convergence history with three levels Figure 16: Constant Mach contours at symmetry 
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Abstract 

An implicit multigrid procedure for the solution of the 
Navier-Stokes equations of viscous, compressible flow is 
described. Attention is focused on the implicit inclusion 
of the viscous contributions to the equations in a way 
that will enhance the stability, yet not disturb the effi- 
ciency of the algorithm. Two- and three- dimensional 
laminar flows are computed to demonstrate the stability 
and efficiency of the scheme. 


1 Introduction 

In the numerical simulation of viscous flows at high 
Reynolds numbers, it is necessary to resolve the thin 
shear layers which develop near solid boundaries. Such 
thin shear regions require the use of grids with cells 
of very high aspect ratio, which are known to hin- 
der convergence for steady problems when using ex- 
plicit schemes. To overcome these difficulties, Caughey 
has developed a diagonal Alternating Direction Implicit 
(ADI) algorithm for the solution of the Euler equations 
of inviscid, compressible flow [1], Rapid convergence is 
achieved with the use of the implicit scheme within a 
multigrid framework. 

Here, the method is extended to solve the Navier-Stokes 
equations. Attention is focused on methods of adding 
the viscous contributions in a way which does not dis- 
turb the overall stability and efficiency of the implicit 
scheme. No attempt is made here to incorporate a tur- 
bulence model, so the discussion will be limited to lam- 
inar flows. 
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2 Governing Equations 

Compressible viscous flows are governed by the Navier- 
Stokes equations. These equations require that both 
streamwise and normal viscous diffusion be computed. 
Under certain conditions, it is possible to neglect 
streamwise diffusion, resulting in the thin layer equa- 
tions. Both the Full Navier-Stokes (FNS) equations and 
the Thin Shear Layer (TSL) approximation to those 
equations are considered here. The development of 
the algorithm will be described for the two-dimensional 
equations. Extensions to three-dimensions are straight 
forward; equations relevant to the three-dimensional al- 
gorithm are listed in appendix A. 

2.1 Navier-Stokes Equations 

The nondimensional form of the Navier-Stokes equa- 
tions is written 

dxv dig , d lc = d Lv , d lv m 

dt dx dy dx dy 

This system represents conservation of mass, momenta, 
and energy. The vector of conserved dependent vari- 
ables w consists of the density, cartesian momentum 
flux, and total energy per unit volume and is written 

w= {p,pu,pv,e} T . 

The three-dimensional form contains five elements since 
a third momentum equation is required. The convective 
flux vectors in the x- and y- directions, respectively 
and £ c , and the viscous flux vectors f_ v and g y are 
given by 

ic = {pu,pu 2 + p,puv,{e + p)u } T , 

2c = {P v ’P uv >P v2 +P.(e + p)u} T , 
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fjy — {0) T XXl r xy\ Px} I 

9y — {0, Tyx, Tyy, (3y } . 

The viscous shear stresses and the heat fluxes are of the 
form 

T XX = 2/iu x + A (u x + Vy ) , 

Tyy = 2 liVy + A (ti X + Vy ) , 

T X y — Ty X = fi (tty + V X ) , 

Px = (t ■ u) x + kT x , 

Py — (? ' u )y "b kTy, 

where k is the coefficient of thermal conductivity and T 
is the temperature. The second coefficient of viscosity 
A can be related to the molecular viscosity p by Stokes’ 
hypothesis 

* = -fc. (2) 

An equation of state is needed to relate the pressure and 
total energy: 


P = ( 7-1) 


e 


\p{v 7 + v 



(3) 


are the transformed flux vectors. The contravariant ve- 
locities U and V are related to the cartesian velocities 
by 


(?M:) 


where J is the Jacobian of the transformation written 
as 


J = 



Vy . ' 


The determinant of the inverse of the Jacobian is h 
which is written 


h = j J -1 


x t x n 
Vv 


and which corresponds to the cell area. The metrics are 
computed from the grid coordinates by 

h-£,z — Vy , 

h£y =• —X , ), 

hr]x = -Pf, 
hrjy = Xf. 


where 7 is the ratio of specific heats. 


2.2 Thin Layer Equations 


To allow treatment of arbitrary geometries, these equa- 
tions are transformed into curvilinear coordinates and 
written in Strong Conservation Law (SCL) form as 

dw dFg dGc _ dF v ' 8G V 

dt ^ di dr, d£ dr, ’ V ’ 


where W_ = hw_ is the transformed dependent variable 


and 


Ec(m = h 


/ pU 

pUu + $ x p 

PU V + £yP 

\ (c 4- p)U 


Gcim = h 


f pv 

pV u + rj x p 
pVv + T)yP 

V (e + p)V 


\ 

/ 

\ 

/ 


E.v(W,K < ,W JI ) = h 


( ° 

ZxTxx + Zy T xy 
£x x xy "b ty x yy 
\ £zPx + tyPy 


GviYLWz,^) = h 


l 0 

r)x r xx “I" f]y Xxy 
Vx^zy b Vy x yy 
\ T)xPx + VyPy 


\ 

/ 

\ 
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Under certain conditions it is possible to neglect viscous 
diffusion in the streamwise direction without adversely 
affecting the quality of the solution. This simplification 
requires that the flow have a predominant direction, and 
be without massive separation. High Reynolds number 
flows over wings are one such example. Implementation 
of such a model requires that body surfaces be mapped 
onto coordinate surfaces, and there be sufficient cluster- 
ing normal to the shear surface to allow the boundary 
layer to be resolved. It can be argued that even if the 
complete equations are used, viscous diffusion in the 
streamwise direction cannot be resolved unless the grid 
is sufficiently fine in that direction [2], and for many 
practical flows, current computational limitations pre- 
vent the use of grids with sufficient resolution in both 
the normal and streamwise directions. 

The transformed viscous flux vectors can be decoupled 
into components which depend only on the vector of 
dependent variables and its derivative in either the £- 
or 77- direction: 

F v = 

= (5) 

Gy = Gym^HG,) 

= Gyi^LW^) + Gy{W^W^). 


( 6 ) 



The thin layer approximation entails retaining only the 
surface normal- or 17 - derivatives from the viscous terms 
in the Navier-Stokes equations (Eqs. (4)); that is, only 
the term GyiW^Wj)) is kept when the body surface is 
a line of constant tj. The thin layer equations are then 
written 

dG^ dG v 


with 


and 


dW dFc 
dt + a* 


+ 


dr) . dr] 


G v (W 1 W J1 ) = h 


/°' 

4 * TJyTxy 

T) X Txy + TjyTyy 

\ VxPx + VyPy 


\ 


(?) 


T xx 

= 2 / 117 * 0 ,, 

+ 

KWr, 

+ 

Wn)’ 

Tyy 

= 2 /!T 7 s,v„ 

+ 


+ 

T ly v n)y 


= r yx = 



+ 

nxVr,), 

K 

= (^ • «)* 

+ 




K 

= (r ■ u) y 

+ 

kVy'T'r)- 




3 Numerical Method 


3.1 Finite Volume Formulation 


Spatial discretization of the governing equations is ac- 
complished with a finite volume scheme similar to that 
of Jameson et al [3]. The physical domain is partitioned 
into quadrilateral cells and a conservative flux balance 
is applied to each cell. The net flux across the cell faces 
is evaluated assuming a constant value of velocity on 
each face. In the present implementation, discretized 
solution variables correspond to cell averaged quanti- 
ties. That is, 

mj = -r— [ wdS, ( 8 ) 

n ',j Jhij 


where h{j is the area of grid cell (*, j). This average 
value does not correspond to any particular location 
within the cell, but for convenience may be thought of 
as the value at the cell center. The values on the cell 
faces are computed as simple averages of the values in 
adjoining cells. For example, the value of ti on the face 
shared by cells ( i,j ) and (t + 1 , j) is 


u 




Hi+u + (pu)ij 
(p)i+l,j + (p)ij 


(9) 


Discretization of the viscous terms requires that first- 
order derivatives be known on the cell faces. By appli- 
cation of Green’s theorem the cell averaged derivatives 


are computed by a line integral over the cell boundaries, 
so that, for example, 


( 1 f du 

bijkj* 



(10) 


where, consistent with the finite- volume approximation, 



+ 


with 


u i+b.i A yi+hJ + u u+', A yi,i+h 

u i-iJ Ay <-hd + u id-b (U) 


A y'+k.j = yi+±,j+± ~ yt+±,j-±> 

A y<j+ t = y>- %,j+ y — y>+ 

A y>-$,j = ~ y«- 

A y*,j-h = y '+bi-b ~ y'-hi-h' 

and the velocity u computed as shown in Eq. (9). The 
approximation of Eq. 11 is appropriate for the FNS for- 
mulation. An approximation consistent with the TSL 
formulation is 

u dy = u iJ+ ±Ay iij+i + ti,j_ jA j. ( 12 ) 
J oh 

The values on the faces are then taken to be the averages 
of the values in neighboring cells. 

3.2 Numerical Dissipation 


Applying the finite volume discretization to each grid 
cell, a system of ordinary differential equations emerges 
which has the form 

- -Qclllij + QvUUj< (13) 

where Q e and Q v are centered difference operators rep- 
resenting the convection and viscous diffusion terms. 
Use of a centered difference scheme for the Euler equa- 
tions may lead to solutions which are decoupled at odd 
and even grid cells due to the first differences of the con- 
vection terms. The addition of some form of dissipation 
is required to ensure convergence to a steady state. In 
comparison, the Navier-Stokes equations are naturally 
dissipative and, in a centered difference scheme such as 
that described here, are coupled by second differences 
of the viscous terms. However, to suppress the growth 
of nonlinear instability in regions of the flow where vis- 
cous damping is inadequate, and to prevent oscillations 
in regions where the grid is incapable of resolving the 
gradients, such as near shocks, adaptive dissipation sim- 
ilar to that described by Jameson [3] and modified by 
Caughey [1] is added to the scheme. 



With the added dissipation, the difference approxima- 
tion is written 

= -Q'UU,j + Qvnu,j + D m,j> ( 14 ) 

where D is the difference operator representing the ar- 
tificial dissipation. Following Caughey [1], D is written 

D UU,j = + D f,UU,j (15) 

where 


= 6t(dam,j) (16) 

D, = S.id.yuj). (17) 

and and 6 V are central difference operators spanning 
a single grid cell. In the operator D(, 


^ = {4 2) ^-^(4 4) ^ 2 )>^,i- ( 18 ) 


where the coefficients and are adapted to the 
solution using a switch which measures the magnitude 
of the change in pressure gradient across a cell defined 
by 


„ _ Ip.+i.j ~ 2p,-,j +p,~- i,j-| 

(, ' J Pi+ij + 2 pij + Pi-i,j 


(19) 


To limit spurious dissipation in regions close to solid 
boundaries, i.e. where the physical dissipation is most 
important, the coefficients are scaled by a function of 
the local Mach number M , and so are then written 





X 

max(i/ f i+1 u Uj ) 

(20) 

and 



4J> = max(0, /c^ 4 ) 


(21) 


It has been suggested by several researchers [4] that 
the Mach number scaling function /(M) be different 
depending on whether the flow is separated or attached; 
here no such adaptation was attempted and for most' 
cases 

/(m)= G0’ (22) 

was found adequate. When using the TSL approxima- 
tion, only dissipation in the rj— direction is scaled by 
the Mach number due to the lack of natural dissipation 
in the f— direction. 


To minimize the amount of added dissipation, the con- 
stants k and should correspond to the smallest 
values which will allow convergence to steady state and 
prevent spurious oscillations. For the present calcula- 
tions, values of = 2 and k ^ = 1/32 were used. 


3.3 Iterative Scheme 

To advance the solution in time, an ADI procedure suit- 
able for the Navier-Stokes equations has been devel- 
oped. Such a scheme begins by approximating spatial 
derivatives at new and old time levels, linearizing the 
changes in the flux vectors, and approximating the im- 
plicit operator as a product of one-dimensional factors. 
The implicit operators are then diagonalized to improve 
the computational efficiency of the scheme. Special at- 
tention is directed at methods to include viscous contri- 
butions in the implicit operators. 


3.3.1 Delta Form 

The first step in developing an ADI scheme is to ap- 
proximate the spatial derivatives as weighted averages 
at new and old time levels. Such an approximation to 
the full Navier-Stokes equations (Eq. (4)) can be written 

A ms + 0A (£2# - E2ij) + *,(££? - &ij) 

- - &,j) - 6n(G3r% 1 - CiVij)} 

= -A t{6(E2m + S*GZi, - StLvu - *->£vy}.(23) 

where AW?- = W"* 1 — W? - is the correction added to 
the solution, and 6 represents the degree of implicitness 
with 0 < 9 < 1. Eqs. (23) are commonly referred to as 
the “delta” form derivation [5] since increments of the 
conserved variables and fluxes are involved. 

For this scheme to be computationally efficient, it is 
necessary to linearize the implicit operator and approx- 
imate it as a product of one-dimensional factors. Un- 
like the Euler equations, the multidimensional Navier- 
Stokes equations are spatially coupled due to the pres- 
ence of mixed derivatives in the viscous stress tensor. 
Therefore the implicit operator cannot be factored di- 
rectly into one-dimensional components. The problem 
can be avoided by neglecting the mixed derivatives in 
the the left-hand side operator of Eqs. (23). This is 
accomplished using Eqs. (5) and (6) to give 



= -Af{6 e ££ 0 - + S'CEa, - ~ ^£vy}>( 24) 

Neglecting the mixed derivatives in this way does not 
alter the consistency of the approximation. Nor can it 
affect the converged solution since the right-hand side 
of the equation - the residual vector - remains intact. 
It can alter the stability properties of the scheme, but, 
as will be demonstrated, the scheme remains uncondi- 
tionally stable according to linear stability theory [6, 7]. 
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In the TSL formulation there are no mixed derivatives, 
and so the delta form approximation is 


The viscous flux Jacobians are 


A YSi + Ecij) + Sni&cti ~ Oca ) ' 

— (&vij — Gvij ) } 

= —At{S(F^ij + S n Q2:ij ~ (25) vs 


where 


In the remaining development only the FNS approxima- 
tion will be considered. The analogous TSL schemes can 
be readily derived by neglecting the appropriate terms. 


3.3.2 Linearization 

The changes in the convective flux vectors can be lin- 
earized using local Taylor series expansions in time to 
give [8] 

£&/-£&« = A?.AW$ + 0(At 2 ), (26) 

S&tf-G&ij = Bh.AWa*.+0<At 2 ), (27) 

where A = {dF^/dW} and B = {dG c /dW} are the 
Jacobians of the transformed convective flux vectors 
with respect to the solution and are given by Warming, 
Beam, and Hyett [9] and by Chaussee and Pulliam [10]. 

Since the transformed viscous flux vectors and Gy 
are functions of W and or respectively, the ap- 
propriate linearizations are 

Ev+l-tvij = K? J AW^. + L? i AW^, 7+ 0(At 2 ) 

= (K, 7 - L w rA]£". 

+ ^(L[* J .AWr i )+0(At 2 ), (28) 

alV-avij = M“A£P” + + O (At 2 ) 

= (M, 7 - N„ 7 ) n Ai^. 

+ J- (n^AW".) + o (At 2 ) , (29) 

where K = {8F v /dW} and M = WG*, IdW} are 
the Jacobians of the transformed viscous flux vectors 
with respect to the solution and L = {dF v /dW and 
N = WGw IdW n 1 are the Jacobians with respect to the 
derivatives of the solution. Recognizing that K-Lj = 0 
and M — N, = 0 if the the transport coefficients are ap- 
proximated to be locally constant [6, 11], the lineariza- 
tions reduce to 

IvV-lva = ^(L^AW?y) +0(Af 2 ), (30) 
alt- -Gin = |-(N])AW? i )+0(At 2 ). (31) 


L, N = 


0 

0 

0 

0 

£21 

e22 

e23 

0 

£31 

e32 

e 33 

0 

£41 

e42 

e43 

e44 


622 = a ”G)’ 

623 = a **Q)- 

e 31 = -a xy (^j - a yy , 

£32 = £23) 

£33 = &yy ) 

£41 = -«** (7) (y) ~ a ™ (7 ) 

( e u 2 + v 2 \ 

+ °'{-? + — )’ 

£42 = -ate^'J-e 21, 

f v\ 

£43 = — a e I - I -e 31 , 

e 44 = ot e , 


Qxx = (2|i + A)k 2 + pKy, 

Qxy = (^ + A)« x « y , 

atyy = /i* 2 + (2fi + A)k 2 , 

ar e = juPr-'inl+K*). 

and k = £ or 77 for L and N, respectively, and Pr is the 
Prandtl number. This matrix contains no derivatives of 
the solution variables, unlike the coefficient matrix in 
Ref. [12] derived by a similar method. 

Introducing the approximations of Eqs. (26) ,(27), (30), 
and (31) into Eqs. (24) and 'including terms from the 
artificial dissipation results in the scheme 

{I -I- 0At[A, 7 5 e + B, 7 <5, - <5 2 L, 7 - 5 2 N, 7 

" *$(«! + W h ) + + 5t){l/h)]Y^ 

= -At - W, 7 - Sr.GHnj 

- + 6*) mj + 4J(5| + «, 4 k 7 r- (33) 

For simplicity, the numerical dissipation coefficients in 
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Eqs. (33) are written as if they were the same for the 
£— and rj— directions, although they differ in reality. 

Approximating the left hand side of Eqs. (33) as the 
product two one-dimensional factors - one for each of 
the spatial directions - results in a block ADI scheme 
and is written 

{1 + dAt[Aij6{ — SfLij 

x {I + 0At[3ijS v — 

- e$«»(l /h) + e$S*{ 1/A)]}"AW?. 

= - A t {S'EZn + ~ S t£-vij ~ Sn&ti 

~ «$(*? + 6#yUj + *$(«? + & $)VU,)) n - (34) 

To advance the solution one time-step using Eqs. (34), a 
two step procedure - one for each of the implicit factors 
on the left hand side - is used. In the first step, the 
intermediate correction AW*.- is determined by solving 
the block pentadiagonal system 

{1+ 0At[A,j£{ — 

- (1/fc) + eVsKl/hWAW-j = Buj (35) 

along each line of constant tj. Here JJ,- • is the residual 
vector corresponding to the right hand side of Eqs. (34). 
The correction AW." is then determined by solving the 
system 

{I + 9At[BijS v — 

- e?]6*{l/h) + e^6*(l/h)}} n AW2j = AW*,- (36) 

along each line of constant £. The size of the blocks 
is 4 x 4 for the two-dimensional problem which corre- 
sponds to the order of the system of equations. For the 
three-dimensional problem, the blocks are size 5 x 5; in 
addition, a third factor corresponding to the additional 
spatial direction is required. 

3.3.3 Diagonalization 

For the Euler equations, the convective flux Jacobians 
can be diagonalized with local similarity transforma- 
tions as A =t QaAaQa* and B = QbAbQ g 1 , where 
A a and A b are diagonal matrices whose diagonal ele- 
ments are the eigenvalues of their respective Jacobians, 
and Qa and Q# are the modal matrices whose elements 
can be found in [11]. This allows the block equations 
to be decoupled into equations which can be solved 
as scalar pentadiagonal systems, greatly reducing the 
amount of computational labor needed for a solution. 


For the Navier-Stokes equations, it is not possible to 
both include the viscous terms in the implicit factor and 
to diagonalize the system, since the convective and vis- 
cous Jacobians are not simultaneously diagonalizable. 
Several alternatives exist to circumvent this problem. 

Method 0: Explicit Treatment If viscous contri- 
butions are neglected completely from implicit consid- 
eration, a diagonalized system can be written 

{I-f 0At[AAij6f 

xQs»i{I+ 

- e<$6l(l/h) + e\ 4 JS*(l/h ))) QifcAlEi 

= -A t ~ 

- ( 3? ) 

Neglecting the viscous terms from the implicit fac- 
tors may jeopardize the stability of the scheme. It is 
desirable to maintain the efficiency of the diagonalized 
scheme without degrading its stability properties, so al- 
ternate approaches must be explored. 

Method L Implicit Approximation The first im- 
plicit method consists of using the largest eigenvalue of 
the viscous Jacobian to add contributions to the inviscid 
implicit factors. This is similar to what was suggested 
by Pulliam [12]. The eigenvalues of L,N are 



(38) 

A, = (£) (*2+*;), 

A4 = 0, 

where Pr is the Prandtl number. 

A scheme is constructed by adding the diagonal approx- 
imations A 1 « Q^LQa, and A ^ « Qj^NQ b to the 
appropriate implicit factors: 

{1 + 6At[AAijSz — 

-«£?«|(i/*) -i-^d/AJDQa 

x Qsi; {I 4* 0At[A Bij^n — 

- e$6l(l/h) + ^(l/ZOUQ^-AWS 

= — At QAij{6(£cij + t>n Gcij ~ hE-Vij ~ t'nGS'ij 

~ + S n)^J + «$(«* + (39) 
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1 

0 

0 


The diagonal approximations and A $ are for exam- 


pie 


K^+O 1 . 

(40) 

and 


1 (jll + »?y) I, 

(41) 


where I is the identity matrix. The number of additional 
operations needed to implement this scheme is negligible 
since it involves only the calculation of an eigenvalue 
whose analytical form is known. 


Method II: Additional Operator A second option 
is to use additional implicit operators which contain the 
exclusive contributions from the viscous terms. The ad- 
dition of the viscous Jacobians directly to the operators 
would require the solution of block tridiagonal systems. 
However, since the eigenvalues of the viscous Jacobians 
are distinct (Eqs. (38)), modal matrices Q i and 
exist which diagonalize L and N, respectively, through 
a similarity transformation. This results in the diagonal 
scheme 


— t; 

— - ■ . 

+ " jj 

0 


(44) 


where k = £ or rj for or Q^, respectively, and 

Kx . 


Kg — 




y/ K l + K \ 

9 = k x u + k y v, Jt = k x v — k y u. 


The three-dimensional viscous Jacobi matrices and their 
eigenvalues are given in the appendix. Although the 
eigenvalues are not unique in the three-dimensional 
case, the matrices can still be diagonalized. The multi- 
ple sets of modal matrices, which are possible because of 
the degenerate eigenvalue, are also listed in the appedix. 


3.4 Convergence Acceleration 

3.4.1 Local Time Stepping 


xQflij{I + 9At[A B ij6 n 

- $6 7 n(W + ^ 4 (lM)]}Qiy 

xQ^.{I - 0At^A^..}Q-i.AWOi 
= —At Q -^{StEcij + Sv&a ~ h&u ~ *nGviS 

- e?M + 6 n)^,i + i>". ( 42 ) 

where K Uj = Q^LQ^ and A* fj . = Q^NQ*,; are 
diagonal matrices whose nonzero elements are the eigen- 
values of L and N, respectively. In this scheme, two ad- 
ditional scalar tridiagonal systems must be solved; for 
the TSL approximation, only one additional factor ap- 
pears in the equation. The viscous modal matrices are 
written 


For problems in which only the steady-state solution is 
of interest, the goal for any iterative procedure is to 
reach that solution as quickly as possible. One way of. 
accelerating the convergence is to use local time steps 
in which a time step is proportional to the size of the 
grid cell. The directional time steps are based on one- 
dimensional inviscid wave propagation corresponding to 
unit Courant number and are written 


Ate = 


At, = 


|y„u - x v v\ + ay/xr, 2 + y„ 2 ’ 

h 


|-y € u + X£v\ + a^/if 2 + y c 2 
A local time step At is then defined as 

A«-A 

Atf 4- A tq 

where A is the Courant number. 


(45) 

(46) 


(47) 
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-j + u 2 + V 2 

—u 



- m ~ m 


1 

0 


3.4.2 Multigrid 

(43) 

The resulting ADI scheme has good high wave num- 
ber error damping characteristics. Effective removal of 
low wave number error is accomplished using a multi- 
grid scheme in which corrections to the solution are re- 
cursively computed on a hierarchy of grids. The algo- 
rithm is based on that originally developed by Jame- 
son [13] and described by Caughey [1] and Smith and 
Caughey [14]. 



A coarse grid is created by removing every other grid 
line from the fine grid, essentially doubling the grid 
spacing in each direction. The multigrid cycle begins 
by advancing the solution a fixed number of time steps 
on the fine grid using a time stepping procedure such as 
that described earlier. Next, the solution is restricted 
to the coarse grid using area-weighted (volume-weighted 
in 3D) averages. The fine grid residuals are also re- 
stricted since they are needed to drive the solution on 
the coarse grid. A forcing function is added to the coarse 
grid residuals and is defined as the difference of the re- 
stricted fine grid residuals and the initial coarse grid 
residual. It is important to use such a forcing function 
so that the fine grid accuracy is maintained. The proce- 
dure is then repeated by recursively starting a number 
of multigrid cycles from the present grid level. High 
wave number error on a coarse grid corresponds to low 
wave number error on a fine grid so by using a time 
stepping procedure with good high wave number error 
damping characteristics on successively coarser grids, a 
range of wavenumber errors is eliminated from the fine 
grid solution. Once the coarsest grid is reached, the cor- 
rections are prolonged to successively finer grids using 
bilinear (trilinear in 3D) interpolation in the computa- 
tional coordinates. At each level in the prolongation, 
the time stepping procedure is again performed for a 
fixed number of iterations (time steps). 


inviscid, characteristic boundary conditions based on 
the Riemann invariants of the one-dimensional prob- 
lem normal to the boundary are used. The Riemann 
invariants are determined using either freestream val- 
ues or those extrapolated from the interior depending 
on the direction of propagation of their respective char- 
acteristics. On the outflow boundary, where viscos- 
ity is important in the wake, pressure is fixed to the 
freestream value, density and momentum flux are ex- 
trapolated from the interior. The energy is then com- 
puted from the equation of state. The implicit boundary 
conditions are also treated in a manner consistent with 
the characteristic theory. 


3.6 Stability Analysis 


A qualitative description of the stability properties 
of these schemes is obtained from a von Neumann 
(Fourier) analysis of an advection-diffusion equation. 
This equation with fourth-order numerical dissipation 
is written 


du 

dt 


+ 


du du 
C dx + C dy 


+ ceAx 3 


d 4 u 
dx 4 


3 d 4 u 


+ kA 0 w 


f 4 d 2 u 1 d 2 u d 2 u ) 

i.3 5?' + d^j 


(48) 


By controlling the number of iterations and multigrid 
cycles on each level, various cycles are possible. For ex- 
ample, if the multigrid procedure is executed once from 
each level, a V-cycle results. A recursive W-cycle results 
from executing the multigrid procedure twice from each 
level. Grid sequencing is used to initialize a starting so- 
lution by first computing the solution on a coarse grid, 
and then interpolating the coarse grid solution to the 
fine grid. 


3.5 Boundary Conditions 

On solid surfaces, the no-slip condition is imposed by 
setting the velocity components u and v to zero. Adi- 
abatic walls are assumed and so the normal tempera- 
ture gradient at the wall is zero. The normal pressure 
gradient in the first cell off the wall is found from the 
momentum equations; for high Reynolds number flows, 
the normal pressure gradient may be set to zero at the 
wall. Once temperature and pressure are determined 
from their respective gradients, density is found using 
the ideal gas law. The energy is then computed from 
the equation of state (Eq. 3). 

On the far field inflow boundary, where the flow is nearly 


This equation serves as a model for the full approxima- 
tion of the Navier-Stokes equations. A model for the 
thin layer approximation is obtained by removing the 
viscous derivatives in the x— direction. Substituting the 
Fourier term = G n e l P* x e l P'> v , and constructing an 
ADI scheme analogous to that described earlier leads to 

(G — 1) {1 -f J0A* sinf + 16^Aj:esin 4 ^ 

+ y<r/0vA,Re r _ 1 sin 2 |} 
x {1 + i0A x Aff -1 sin 77 -I- 160A r AR _1 esin 4 ^ 

+ 40v A x Re x ~ l AR~ 2 sin 2 
= —A* {i(sin£ -f AR” 1 sinr;) 

+ 16e(sin 4 ~ + AR~ l sin 4 ^) 

ml mt 

16 c 1 

+ Re x ~ 1 (—a’/ sin 2 £ + -<rjAR~ l sinfsinr; 

o 2 u 

+ 4 AR~ 2 sin 2 ^)). (49) 

A 

From Eq. (49), the magnitude of G can be calculated, 

|G| = f(Z,i];\ x ,Re t ,AR,e,d,6v), 

where £ = /? r Ax and rj = j3 y Ay are the grid wave num- 
bers, 6 and Ov are the implicit parameters, the latter 



corresponding to the viscous terms. The model switch 
cry is set to 1 when modeling the FNS approximation, 
and to 0 when modeling the TSL approximation. In ad- 
dition to the Courant number A x = cAt/Ax and artifi- 
cial dissipation e, the numerical stability of the implicit 
viscous equations is governed primarily by the aspect 
ratios AR = Ay/Ax of the mesh cells and the mesh 
Reynolds numbers Re x = cAx/v. The expression in 
Eq. (49) corresponds to what is done in Method I. Sim- 
ilar expressions representative of Method II can also be 
derived. 


verged solution without including viscous contributions 
in the implicit factors. If additional viscous operators 
are added to the scheme (as is done in Method II), the 
solution will remain conditionally stable, although the 
region of stability is increased slightly as shown in Fig. 2. 
The stability analysis indicates that the most promising 
algorithm would be one similar to Method I. Method II 
should also be considered, however, in so far as it rep- 
resents less of an ad hoc approximation than Method I. 


Using such a model, it is found that when viscous terms 
are added directly to the convective operators, analo- 
gous to what is done with Method I, unconditional sta- 
bility is achieved. This is true even when the mixed 
derivatives are neglected in the implicit operator. If 
the viscous terms are evaluated explicitly without im- 
plicit contributions, a conditionally stable scheme re- 
sults. This can be seen in Fig. 1 in which the dark areas 
represent regions in parameter space where von Neu- 
mann analysis predicts an amplification factor greater 
than unity. This figure represents the properties of the 
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Figure 1: Method 0 

Viscous contributions included only in the explicit fac- 
tor. Gray areas are regions of instability where the 
growth factor is greater than unity 

scheme applied to a model problem using values of the 
dissipation parameters representative of those used in 
the computations, and a Courant number of 16. These 
results jp not rule out the possibility of obtaining a con- 
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Figure 2: Method II 

Viscous terms are included implicitly in an additional 
factor. Gray areas are regions of instability where the 
growth factor is greater than unity 


4 Results 

Both implicit methods are implemented in a computer 
code to calculate transonic flows. A number of test cases 
have been computed for flows past a two-dimensional 
NACA 0012 symmetric airfoil. For all cases presented, 
a simple power-law is used for the dependence of the 
viscosity coefficient on temperature. The coefficient of 
thermal conductivity is proportional to the molecular 
viscosity under a constant Prandtl number assumption. 

The first case presented is for subcritical laminar flow 
(iZe = 5000, Mqo = 0.5) past a two-dimensional NACA 
0012 symmetric airfoil at zero degrees angle of attack. 
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The calculation is performed using the thin-layer ap- 
proximation on a 256 x 64 cell algebraically generatated 
“C”-grid. A section of this grid is shown in Fig. 3. The 
outer boundary of the grid is located about 10 chords 
from the body. Care is taken to insure sufficient cluster- 
ing in the region close to the body surface where viscous 
effects are significant. Approximately 25 grid points are 
included within the boundary layer at the airfoil trailing 
edge, and the first point normal to the body surface is 
located at about .001 chords. 



a.m 


257x65 


Figure 3: Section of 256 x 64 “C”-grid 

The surface pressure distribution, presented in Fig. 4, 
agrees well with that presented by by Martinelli^ Jame- 
son, and Grasso [15]. The flow separates at approxi- 
mately 82% of the chord, as can be seen from the skin 
friction distribution in Fig. 5; this value is close to that 
reported by other researchers [16, 17, 18] for this case. 
The pressure and friction drag coefficients are found to 
be 0.02238 and 0.03289, respectively, which are very 
close to the values reported by Venkatakrishnan [18]. 
Using Method I with 6 levels of multigrid, the solution 
converged to a steady state (defined as within 0.1 % of 
the fully converged solution) in approximately 50 work 
units which corresponds to 30 multigrid cycles. One 
work unit is defined as the amount of work required 
to advance the solution one time step on the finest grid 
level; for a simple V-cycle - the strategy used here, each 
multigrid cycle requires approximately 1 2/3 work units. 

To illustrate the effect of the implicit schemes described 



Mm* MOO U ftm 0000 Wm U OOO hO 

a — *4100* C M Cm L.TOOO*— OO COf 0 J H H 

1st 10* on 

Figure 4: Surface Pressure Distribution 
Re = 5000.0, Moo = 0.50, a = 0.0 

earlier, the above case is computed on a 192 cell “C”- 
grid generated using the GRAPE code elliptic mesh gen- 
erator [19]. The iterative process is begun by initializing 
the solution to free stream values. Plots of the conver- 
gence histories for the different methods are shown in 
Figs. 6 and 7. Five levels of multigrid are used for the 
results in Fig. 6, and only a single grid is used for the 
result shown in Fig. 7. Local time stepping is used for 
all results. The error curves plotted are the normal- 
ized root mean square of the density residual. Using 
Method I with multigrid, the solution converged to a 
steady state in approximately 35 work units which is 20 
multigrid cycles. 

This solution is computed at a Courant number of 16 
using Method I. At this Courant number, both Meth- 
ods I and II produce converged solutions as illustrated 
in Fig. 6. Here, the asymptotic convergence of Method I 
is somewhat better than Method II. A converged solu- 
tion is not attainable if viscous terms are neglected from 
the implicit factors (Method 0) at a Courant number of 
16. However, the scheme does converge at the expense 
of a lower Courant number, hence, a slower convergence 
rate, as shown in Fig. 6. This demonstrates the impor- 
tance of maintaining an implicit viscous contribution in 
the numerical scheme. The convergence rates for all 
schemes suffer without the use of multigrid. For ex- 
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Figure 5: Skin Friction Distribution 
Re = 5000.0, Moo = 0.50, a = 0.0 


Figure 6: Convergence Histories using Multigrid 
Re = 5000.0, Moo = 0.50, a = 0.0 
Methods I, II at Courant number 16 
Method 0 at Courant number 2 


ample, approximately 500 work units - fourteen times 
more than that needed when using multigrid - are re- 
quired before the drag coefficient converges to a steady 
state as shown in Fig. 7 for Method I. . 

The second case tested is the transonic laminar flow 
(Re = 500, Moo — 0.8) past a NACA 0012 airfoil at ten 
degrees angle of incidence. The region of separated flow 
is quite substantial for this case and necessitates the use 
of the full Navier-Stokes equations. The 256 x 48 cell 
“C”-grid described earlier is used for this case. Plots 
of the pressure coefficient and Mach number field are 
show in Figs. 8 and 9. The solution appears to be in 
reasonable agreement with that obtained by Martinelli, 
et al. [15]. In this case, the flow separates from the 
upper surface of the airfoil at approximately 25% of the 
chord. 

To test the algorithm in three-dimensions, a calculation 
of subsonic flow past a NACA 0012 wing with zero sweep 
angle and an aspect ratio, of 12 is compared with the 
analogous two-dimensional case. The calculation is per- 
formed on a 192 x 48 x 12 cell “C-H”-grid, and Method I 
is used for the implicit algorithm. In Fig. 10, the distri- 
bution of pressure coefficient on the wing’s center plane 
is compared with the two-dimensional case, and the ag- 
greement is quite close. The three-dimensional results 
are preliminary and more extensive tests are underway. 



Figure 7: Convergence History without Multigrid 
Re = 5000.0, Moo = 0.50, a = 0.0 
Method I at Courant number 16 
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Figure 8: Pressure Coefficient 
Re = 500.0, Moo = 0.80, a = 10.0 
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Figure 10: Surface Pressure Distribution Comparison 
He = 5000.0, Mqo = 0.50, a = 0.0 
Three-dimensional case - * 
Two-dimensional case - □ 



5 Conclusions 

The importance of including viscous contributions in 
the implicit operator has been demonstrated. Although J 
several options are available for implicit inclusion, the 
addition of approximate terms to the existing opera- 
tors (Method I) seems the most effective. The laminar 
solutions obtained are in good agreement with results 
reported by other researchers [17, 15, 16, 18]. Work 
is underway to incorporate a turbulence model so that 
engineering flows can be studied. 
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A Three Dimensions 
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The viscous flux Jacobians in three-dimensions are 
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The eigenvalues of L, N, S are 
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Three sets of modal matrices associated with the viscous 
Jacobi matrix are 


Set 1: 


Q L’ Q# > Qs 




' 0 

0 

0 

0 

1 “ 

*r 

0 

0 

Kz 

u 

Ky 

0 

Kz 

0 

V 

k z 

0 

-K y 

— Kz 

w 

9 

1 

-Mv x 

-Mxz 

p - 


£ 

_ „K 3 

K r 

K 2 

U 2 + V 2 -f w 2 — - 
• a a • P 

— u 

K 3 r^ry"K* My* 

k x k v 



Mar* 


K B * 2 

K, A 2 

1 

0 


(A3) 


14 



A 3 

— v 

*Z+* 


Kj K 3 
^ y 

k s k 2 

0 


S3 

— w 


-5* 


S3 


0 

1 

0 

0 

0 


(A4) 


Set 2: 


Q£.Qjv>Q5 


0 

0 

0 

0 

1 

k x 

0 

0 

Ky 

ti 

Ky 

0 

k t 

~k x 

U 

k z 

0 

K y 

0 

w 

8 

1 

-fiyz 

—fixy 

<5 

P 


(A5) 


Q^’Q^’QJ 1 = 


U 2 + V 7 + u > 2 - - 
- - , - - p 

Hzllxt'¥ K yl i yz 
A y A 3 

ky fixy~\~k t fix* 

a v a 3 

1 

k y k . 

k 3 
— V 

**. 

K 3 


A 3 

0 


*3 
— ti 
k t k t 

Ay A 3 

a„ a 3 


a 3 

—W 

S*J.S 


k y k 2 

k^k x 

a«a 3 


(A6) 


Set 3: 


Qw> Qs — 


0 0 


kx 

0 

Kz 

Ky 

u 

Ky 

0 

0 

~K X 

V 

Kz 

0 

-K x 

0 

w 

8 

1 

-fizz 

~(*xy 

e. 

p 


(A7) 


Q7 1 .q?.q; 1 = 


\ n * 3 0 

*3 

1 + V 2 + W 2 — J 

— u 

^r^ri + ^v/*Vi 

_ a x a 3 

K , 

S 3 

^rMrv“^*/lyi 

A X A 3 

it 

S 3 

1 

0 

^ y A 1 

S 3 *3 

0 

—V —w 

1 

i,(i 

0 

S*S 3 S t S 2 

«?+*? *y*« 

A X A 3 A x A 3 

0 


o 


(A8) 


where « = £, 77, or C, and and k = £, 77, or £ for L, N, 
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I. Introduction 

I N the numerical simulation of viscous flows at high Rey- 
nolds numbers, it is necessary to resolve the thin shear 
layers that develop near solid boundaries. Such thin shear 
regions require the use of grids with cells of very high aspect 
ratio, which are known to hinder convergence for steady prob- 
lems when using explicit schemes. To overcome these diffi- 
culties, Caughey has developed a diagonal alternating direc- 
tion implicit (ADI) algorithm for the solution of the Euler 
equations of inviscid, compressible flow. 1 Rapid convergence 
is achieved with the use of the implicit scheme within a multi- 
grid framework. 

Here, the method is extended to solve the Navier-Stokes 
equations. Attention is focused on methods of adding the 
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viscous contributions in a way that does not disturb the overall 
stability and efficiency of the implicit scheme. No attempt is 
made here to incorporate a turbulence model, so the discus- 
sion will be limited to laminar flows. 

II. Governing Equations 

Compressible viscous flows are governed by the Navier- 
Stokes equations. These equations require that both stream- 
wise and normal viscous diffusion be computed. The nondi- 
mensionalized form of the full Navier-Stokes (FNS) equations 
is written in curvilinear coordinates as 

dW dF c dG c dF y 3G V 

( 1 ) 

dt df dij d£ d V 

where W is the transformed dependent variable, F C (W) and 
G C (JV) are the convective flux vectors, and F V (W, W f , W„) and 
G V {W, fVf, are the viscous flux vectors. An equation of 
state is used to relate the pressure to the total energy. 

Under conditions in which the flow has a predominant 
direction and is without massive separation, it is possible to 
neglect viscous diffusion in the streamwise direction without 
adversely affecting the quality of the solution. This results in 
the so-called thin layer approximation (TLA). 

The viscous flux vectors can be decoupled into components 
that depend only on the vector of dependent variables and its 
derivative in either the £ or r) direction: 

Fy = Fy(JV,tV ( ,tV v ) = Fy(JV,JV ( ) + F V (W,W,) (2) 

Gy = Gy(JZ}%,E V ) = &(}Zm) + Gy(E,E,) (3) 

The TLA entails retaining only the surface normal or tj deriva- 
tives in the viscous terms of the Navier-Stokes equations [Eq. 
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(1)]; that is, only the term G V (W, fV v ) is kept when the body 
surface is a line of constant 77. The TLA equations are then 
written 


dW dFc dG c _dG v 
dt d£ dt] dr) 

III. Numerical Method 

Spatial discretization of the governing equations is accom- 
plished with a finite volume scheme similar to that of Jameson 
et al. 2 To prevent oscillations in regions where the grid is 
incapable of resolving the gradients, such as near shocks, 
adaptive dissipation similar to that described by Jameson et 
al. 2 and modified by Caughey 1 is added to the scheme. 

For problems in which only the steady-state solution is of 
interest, local time stepping is applied. A full multigrid al- 
gorithm based on that originally developed by Jameson 3 and 
described by Caughey 1 and Smith and Caughey 4 is imple- 
mented to further accelerate convergence. 

To advance the solution in time, an ADI procedure suitable 
for the Navier-Stokes equations has been developed. Atten- 
tion here is directed at methods to include viscous contribu- 
tions in the implicit operators. Only the TLA equations are 
considered in this analysis. Further details, including a deriva- 
tion appropriate for the full Navier-Stokes equations, can be 
found in Ref. 5. 

The scheme begins by approximating spatial derivatives at 
new and old time levels, linearizing the changes in the flux 
vectors, and approximating the implicit operator as a product 
of one-dimensional factors. Linearization of the convective 
flux vectors introduces the Jacobians, A = (dF c /dW) and 
B = (dGc/dJV). 6 -* Since the transformed viscous fhoTS^ is a 
function of W and W v , the appropriate linearization intro- 
duces two additional Jacobians, M = (dGy/dW) and 
N = (d(5y/diV v ). If the transport coefficients and metrics are 
approximated to be locally constant, W-^^O (Refs. 9 and 
10), and only the Jacobian with respect to the derivatives of 
the solution, (V, is needed. This matrix contains no derivatives 
of the solution variables, unlike the coefficient matrix in Ref. 
1 1 in which the metrics are not approximated as locally con- 
stant. Such linearization in the implicit operators results in a 
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Fig. 1 Regions of linear stability for methods 0, I, and II; shaded 
areas represent regions of stability where the magnitude of the growth 
factor is less than unity. 
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Fig. 2 Convergence histories using multigrid; Re = 5 x 10 3 , 
M» = 0.50, a = 0.0, methods I and II at Courant number 16, method 
0 at Courant number 2. 


block pentadiagonal system of equations when the fourth 
differences due to the numerical dissipation are included. 

For the Euler equations, the convective flux Jacobians can 
be diagonalized with local similarity transformations as 
A = Qa^aQa 1 and B = Qb^bQb 1 > where and A fl are 
diagonal matrices whose diagonal elements are the eigenvalues 
of their respective Jacobians, and Q A and Q B are the modal 
matrices whose elements can be found in Ref. 8. This allows 
the block system to be decoupled into equations that can be 
solved as scalar pentadiagonal systems, greatly reducing the 
amount of computational labor needed for a solution. 

For the Navier-Stokes equations, it is not possible both to 
include the viscous terms in the implicit factor and to diago- 
nalize the system, since the convective and viscous Jacobians 
are not simultaneously diagonalizable. Several alternatives ex- 
ist to circumvent this problem. 

One alternative is to neglect the viscous contributions com- 
pletely from implicit consideration but maintain their contri- 
bution to the explicit residual. For bookkeeping, this explicit 
treatment will be called method 0. Neglecting the viscous 
terms from the implicit factors may jeopardize the stability of 
the scheme. It is desirable to maintain the efficiency of the 
diagonalized scheme without degrading its stability properties, 
so two alternate implicit methods are explored. 

Method I: Implicit Approximation 

The first implicit method uses the largest eigenvalue of the 
viscous Jacobian to add contributions to the inviscid implicit 
factors. This is similar to what was suggested by Pulliam. 12 
The eigenvalues of N are 



X4 = 0 


where Pr is the Prandtl number, n and X are the viscosity 
coefficients, and 7 is the ratio of specific heats. 

The scheme is constructed by adding the diagonal approxi- 
mations A# ~ Qg l NQ B to the appropriate implicit factor: 


(/ + 6At A Ai jd()Q A y 

x Q m] [I + 9 At {A Bi A ~ 1 Qiu ' *EJj (6) 

= " A'&rV («t Eh + Q-aj - ^Q.vij) 
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The diagonal approximation A# is, for example, 

A* = X 2 / = fe) ( ^ + V 2 y )f (7) 

where / is the identity matrix. For simplicity, the terms arising 
from the artificial dissipation are not shown in the equations. 
The number of additional operations needed to implement this 
scheme is negligible since it involves only the calculation of an 
eigenvalue whose analytical form is known. 


Method II: Additional Operator 

A second option is to use additional implicit operators that 
contain the exclusive contributions from the viscous terms. 
The addition of the viscous Jacobians directly to the operators 
would require the solution of block tridiagonal systems. How- 
ever, since the eigenvalues of the viscous Jacobians are distinct 
[Eqs. (5)], a modal matrix Qn exists that diagonalizes N 
through a similarity transformation. This results in the diago- 
nal scheme 


(I + dAt K Ai jh^)Q Ajj { x Q mj (I + dAtA Bi j8 v )Q Bi j' 

X - 6A /ajAyg^ftgj, 1 AW$ (8) 

= - AtQ Aij l <$£&, + 8 v G” ClJ - 8 v Gyij) 

where A/jy = Q f jj,- 1 NQ$jj is the diagonal matrix whose nonzero 
elements are the eigenvalues of N. In this scheme, an addi- 
tional scalar tridiagonal system must be solved; for the full 
Navier-Stokes approximation, two additional factors appear 
in the equation. The viscous modal matrices are written 
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equations is governed primarily by the aspect ratios of the 
mesh cells and the mesh Reynolds numbers. 

Using such a model, it is found that, when viscous terms are 
added directly to the convective operators, analogous to what 
is done with method I, unconditional stability is achieved. If 
the viscous terms are evaluated explicitly without implicit con- 
tributions (analogous to method 0), a conditionally stable 
scheme results. This can be seen in Fig. 1 in which the shaded 
areas represent regions in parameter space where von Neu- 
mann analysis predicts an amplification factor less than unity. 
This figure represents the properties of the scheme applied to 
a model problem using values of the dissipation parameters 
representative of those used in the computations and a 
Courant number of 16. These results do not rule out the 
possibility of obtaining a converged solution without includ- 
ing viscous contributions in the implicit factors. If additional 
viscous operators are added to the scheme (as is done in 
method II), the solution will only remain conditionally stable, 
although the region of stability is increased slightly, relative to 
method 0, as shown in Fig. 1. The stability analysis indicates 
that the most promising algorithm would be one similar to 
method I. Method II should also be considered, however, 
insofar as it represents less of an ad hoc approximation than 
method I. 

IV. Results and Conclusions 

To illustrate the effect of the implicit methods, a subcritical 
laminar flow (Re = 5 x 10 3 , M a = 0.5) past a two-dimen- 
sional NACA 0012 symmetric airfoil at zero degrees incidence 
is calculated. A 192 x 48 cell “C” grid is used. A full multi- 
grid scheme is used with the solution on the coarsest grid 
initialized to freestream values. Plots of the convergence histo- 
ries for the different methods on the finest grid are shown in 
Fig. 2. Five levels of multigrid and local time stepping are used 
for the results. The error curves plotted are the normalized 
root mean square of the density residual. Using method 1 with 
multigrid, the solution converged to a steady state in approx- 
imately 35 work units, which is 20 multigrid cycles. One work 
unit is defined as the amount of work required to advance the 
solution one timestep on the finest grid level; for a simple V 
cycle, the strategy used here, each multigrid cycles requires 
approximately 1-2/3 work units. 

At a Courant number of 16, both methods I and II produce 
converged solutions as illustrated in Fig. 2. The asymptotic 
convergence of method I is somewhat better than that of 
method II. A converged solution is not attainable if viscous 
terms are neglected from the implicit factors (method 0) at 
Courant number of 16. However, the scheme does converge at 
the expense of a lower Courant number, hence, a slower 
convergence rate, as shown in Fig. 2. This demonstrates the 
importance of maintaining an implicit viscous contribution in 
the numerical scheme. 

The importance of including viscous contributions in the 
implicit operator has been demonstrated. Although several 
options are available for implicit inclusion, the addition of 
approximate terms to the existing operators (method I) seems 
the most effective. 


Stability Analysis 

Some insight into the stability properties of these schemes is 
obtained from a von Neumann (Fourier) analysis of an advec- 
tion-diffusion equation. This equation with fourth-order nu- 
merical dissipation is written 


du du du 
dt ^dx 


+ ceAx 3 — — + c«Ay 3 
dx* 


d 4 u 

dy 4 



(ID 


This equation serves as a model for the TLA of the Navier- 
Stokes equations. Substituting the Fourier term u” 
= G n eP* x e i P ,y and constructing an ADI scheme analogous to 
that described earlier leads to an equation for the growth 
factor G. In addition to the Courant number and artificial 
dissipation e, the numerical stability of the implicit viscous 
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PARALLEL BLOCK MULTIGRID SOLUTION OF THE COMPRESSIBLE 

NAVIER-STOKES EQUATIONS 
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Ithaca, New York 


I. Introduction 

An efficient parallel algorithm has been developed for 
the simulation of compressible, viscous flow over com- 
plex geometries. The algorithm has been implemented 
within the framework of Composite Block-Structured 
grids which simplifies the construction of grids and al- 
lows the solution of different governing equations on dif- 
ferent blocks according to the physical character of the 
flow; in subdomains where the flow is nearly inviscid, the 
Euler equations are solved, while in subdomains where 
viscous effects are important, the Navier-Stokes equations 
are used. Accuracy and geometric generality are obtained 
by using a finite-volume approximation, solving the equa- 
tions in conservation form and including adaptive dissipa- 
tion for the accurate capture of flow discontinuities. The 
efficiency of the algorithm is achieved by a combination 
of various factors: an implicit formulation suitable for 
highly-stretched grids, a diagonalization procedure that 
reduces the operation count significantly, and a multigrid 
algorithm for rapid convergence acceleration. 

II. Description of the Algorithm 

This algorithm is modeled on the algorithm for the so- 
lution of the Euler equations described in Refs. [I] and 
[2], and the extension of that algorithm for the implicit 
solution of the Navier-Stokes equations [3j. Since turn- 
around time is a major limiting factor in the design pro- 
cess, the code is implemented on a multiprocessor super- 
computer. The block-structured grid, in a natural way, 
allows the use of parallel processing with the solution on 
different blocks being computed concurrently. The imple- 
mentation of the Diagonal Implicit Multigrid algorithm 
for the Euler equations [1] on block-structured grids is 
described in Refs. [4] and [5]. Here, the method is further 
extended for the solution of the Navier-Stokes equations. 
At present, the scheme is implemented on an IBM 3090- 
600J, a six-processor shared memory architecture, using 
' Clustered FORTRAN [6], a parallel extension of FOR- 
TRAN. 
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III. Results 

The algorithm has been implemented in a computer 
code to calculate transonic flow problems in two- and 
three-dimensions. The two-dimensional results presented 
here are used in verifying the accuracy and efficiency of 
the block code by comparing it with a serial single block 
Navier-Stokes code [3]. Also, the parallel speedups are 
reported as a measure of the parallel efficiency of the al- 
gorithm. For all multiblock results presented, the Navier- 
Stokes equations are solved in blocks adjacent to solid 
surfaces and in wake regions, while the Euler equations 
are solved in far-field regions. 


CONVERGENCE HISTORY 
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WORK UNITS 


NACA 0012 AIRFOIL (12bUt.Horiz.[l.l.l],E-N-B 

Mach .500 Alpha .000 

Real .190E-01 CFL 0.00 

Ras2 .373E-10 Grid 192x40 

Work 697.09 Rate .9779 Nmeeh 4 


Figure 1: Residual convergence history - 12 blocks; 

Re = 5000.0, Mo, = 0.50, a = 0.0 

The case presented here is a subcritical laminar flow 
(Re = 5000, = 0.5) past a two-dimensional NACA 

0012 symmetric airfoil at zero degrees angle of attack. 
The calculations are performed using a thin-layer approx- 
imation on a 192 x 48 cell “C”-grid. For the block code, 


965 



a single block grid was artificially divided into either 8 or 
12 blocks. The convergence rate for the 12— block case is 
shown in Figure 1. Using four levels of multigrid in each 
block and local time stepping, the solution converged to a 
steady state (defined as when the force coefficients reach 
99% of their final value) in approximately 25 work units 
which corresponds to 16 multigrid cycles. One work unit 
is defined as the amount of work required to advance the 
solution one time step on the finest mesh level; for the 
strategy used here, each multigrid cycle requires approx- 
imately 1 2/3 work units. This is the same convergence 
rate obtained on a single block. 

As shown in Figure 2, the solution of mixed equation 
multiblock case agrees well with the solution of the sin- 
gle block case. The computed force coefficients agree to 
within 0.05%. 




NAOA 0012 AIRFOIL (lEblXHorU.[I.l.lJX-N-e 
Mach .500 Alpha .000 

Cl .0000 Cd .0220 Cm .0000 

arid 102x48 Work 000.34 Ral .00 

Figure 2: Surface Pressure Distribution Comparison 
Re = 5000.0, Moo — 0.50, a = 0.0 
Multiblock case (12 blocks) - * 

Single block case - □ 

The results of the parallel execution are presented in 
Table 1. A speedup is the ratio of the elapsed time 
required for a serial execution of the algorithm to that 
required for a parallel execution. System overhead for 
executing a parallel job is considered in the calculation 
of theoretical speedups. The actual speedups agree well 
with the theoretical speedups except when there is a load [6] 
imbalance, which can be attributed to blocks of different 


# 

of 

proc’s 

8 Blocks 

12 blocks 

Theoretical 

Speedup 

Actual 

Speedup 

Theoretical 

Speedup 

Actual 

Speedup 

1 

0.97 

0.97 

0.97 

0~97~~ 

2 

1.89 

1.89 

1.86 

L86 

3 

2.77 

2.74 

2.78 

2.76 

4 

3.67 

2.71 

3.64 

3.59 ~ 

5 

4.46 

3.49 

4.50 

3.48 

6 

5.33 

3.47 

5.32 

5.16 


Table 1: Parallel speedups for 8 and 12 block configura- 
tions. 


size or noninteger block to processor ratios. 
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IMPLICIT MULTIGRID SOLUTION OF THE COMPRESSIBLE 
NAVIER-STOKES EQUATIONS WITH APPLICATION TO 
DISTRIBUTED PARALLEL PROCESSING 

Thomas Lee Tysinger, Ph.D. 

Cornell University 1992 


Efficient numerical procedures are developed for the solution of the Navier- 
Stokes equations. The Navier-Stokes equations are a system of conservation 
laws which govern the motion of compressible, viscous, heat-conducting flu- 
ids. A conservative finite volume formulation is used for spatial discretiza- 
tion of the governing equations, resulting in a system of ordinary differential 
equations. To advance the system in time, an Alternating Direction Implicit 
(ADI) procedure suitable for the Navier-Stokes equations is developed. The 
resulting implicit system is diagonalized to improve the computational effi- 
ciency of the scheme. Viscous contributions are added to the scheme implic- 
itly in a way that enhances the stability, yet does not disturb the efficiency of 
the algorithm. Rapid convergence to a steady state solution is achieved with 
a recursive multigrid algorithm. The stability and efficiency of the scheme 


are demonstrated with simulations of flow over wing sections. 

Furthermore, the algorithm has been implemented within the framework 
of multiple-block-structured grids in which the spatial domain is decom- 
posed into several blocks and the solution is advanced in parallel on the 
different blocks. Generic utilities have been developed to implement such 
a scheme in distributed computing environments. The multiple-block al- 
gorithm is designed so that the explicit residual calculation is identical to 
that of the single-block scheme, and therefore converged solutions for both 
schemes must be the same. To accelerate convergence, horizontal, vertical, 
and asynchronous multigrid algorithms are tested. Significant speedups have 
been achieved in a multiple processor environment, while convergence rates 
similar to those of the single-block scheme are observed. 
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Distributed Parallel Processing Applied to an Implicit 
Multigrid Euler /Navier- Stokes Algorithm 
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and 
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Abstract 

An implicit multigrid algorithm for the solution of 
the Euler and Navier-Stokes Equations has been im- 
plemented within the framework of multiple block- 
structured grids in which the physical domain is spa- 
tially decomposed into several blocks and the solution 
is advanced in parallel on each block. Utilities have 
been developed to implement such a scheme in a dis- 
tributed computing environment. The multi-block al- 
gorithm is designed so that the explicit residual calcu- 
lation is identical to that of the single-block scheme, 
and therefore converged solutions for both schemes 
must be the same. To accelerate convergence, syn- 
chronous and asynchronous multigrid strategies are im- 
plemented. Significant speedups have been achieved in 
a multiple processor environment, while convergence 
rates similar to those of the single-block scheme are 
observed. 

1 Introduction 

In a numerical simulation of a physical process, it is 
important to construct a model which accurately re- 
flects the physics of the problem. Also of importance 
is the numerical efficiency of the method, especially if 
the model is to be used repeatedly in a design process. 
Numerical efficiency for the solution of the Euler and 
Navier-Stokes equations was achieved with the devel- 
opment of a diagonalized implicit finite volume method 
within the framework of a multigrid algorithm 
With the recent advances in computer architecture - 
specifically the availability of low-cost high-speed com- 
puter workstations, the development of multiple pro- 

* Research Scientist. Member AIAA. 

1 Professor, Sibley School of Mechanical and Aerospace Engi- 
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cessor compute engines, and the interconnectivity af- 
forded by high-speed computer networks - it has be- 
come attractive to modify numerical algorithms so that 
they are amenable to parallel processing. 

Here, the algorithm developed by Caughey * for solv- 
ing the Euler equations of inviscid compressible flow 
and Tysinger and Caughey ^ for solving the Navier- 
Stokes equations of viscous compressible flow has been 
modified to work in a distributed computing environ- 
ment. A distributed computing system consists of a 
set of processing units and their associated memory 
linked together by a communications network. In the 
algorithm presented, the spatial domain is decomposed 
into multiple structured grids or blocks to allow for the 
treatment of complicated geometry. As an example; 
when computing the flow about an aircraft, the var- 
ious blocks might correspond to the domains in the 
vicinity of the wing, fuselage, nacelle, etc. In the multi- 
block scheme, a separate instance of the flow solver is 
used to compute the solution in each block. In addi- 
tion to providing a natural model for parallelization, 
where the solution is advanced in parallel on the dif- 
ferent blocks, a multi-block scheme allows the solu- 
tion procedure for individual blocks to be customized. 
For example, when computing flow near solid walls, it 
is important to model the viscous stresses accurately. 
In such regions, the Navier-Stokes equations must be 
solved. In the far-field however, where viscous contri- 
butions are small, the Euler equations may be used, 
thereby reducing the overall amount of computation 
per iteration step. 

Yadlin, Tysinger, and Caughey ^ describe a simi- 
lar scheme implemented on a shared memory multi- 
ple processor IBM ES/3090 600J supercomputer us- 
ing Parallel FORTRAN ® , a version of the FORTRAN 
language with parallel extensions. That work was in- 
spired by the success of Yadlin and Caughey ^ in 
the development of an implicit multigrid scheme for 
the Euler equations using block-structured grids on a 
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shared memory architecture. The present implementa- 
tion uses a distributed message passing system devel- 
oped for distributing large scale computations over a 
loosely coupled network of independent workstations. 

2 Distributed System 

Several message passing routines for inter-processor 
communication have been developed to facilitate the 
implementation of the distributed multi-block scheme. 
Much of the system is built on routines which were 
originally developed by the authors for interactive vi- 
sualization of supercomputer computations on graphics 
workstations, while later developments were inspired 
by a research project developed by Gounares ® at Los 
Alamos National Laboratory. Reliability considera- 
tions such as fault-tolerance are not addressed by this 
system. Such important issues have been addressed by 
more mature distributed systems such as the ISIS sys- 
tem developed by Birman at Cornell University (see 
for example ®). The goal here is to develop a library 
of high-level routines which contains little overhead, 
and yet still provides a reliable means of communica- 
tion which is both useful in a research environment 
and simple to set up. Like similar, but more sophis- 
ticated systems PVM ^ and Parasoft Corporation’s 
Express, the routines are portable across differing ar- 
chitectures and allow for heterogeneous computation. 
The system uses communication services provided un- 
der UNIX, which are available with most high perfor- 
mance computers currently available. However, since 
the details of the underlying communication primitives 
remain hidden from the application, the system may 
be built with other communications primitives, such as 
those provided with many dedicated parallel machines, 
without altering the application. 

The system is based on a coupled set of computing 
elements. A computing element may be thought of as 
a virtual processing unit and its associated memory; 
depending on the system used, two or more computing 
elements may share the same physical processor. Each 
computing element contains one or more communica- 
tion links, each of which is uniquely identified. Initially, 
each computing element knows its communications link 
identifiers, those of its neighbors, its internet address, 
and the address of a global server. It does not initially 
know the address of neighboring computing elements 
(or any other computing element). 

The distributed system begins with each comput- 
ing element sending a message to the global server, in- 
forming it of both its address and communication link 
identifiers. The message passing in the initialization 
phase is based on datagrams which provide a proto- 
col for passing small packets of data. Once the global 
server has received a message from each of the comput- 


ing elements, it returns a message to each computing 
element informing it of its neighbors’ addresses. The 
global server also initializes the distributed system for 
either synchronous or asynchronous communication to 
provide flexibility in the design of the algorithm . 

After initializing the system, all the necessary com- 
munication links between the computing elements have 
been established, and reliable data streams based on 
UNIX sockets are used for communication. Blocking* , 
non-blocking, and indeterminate message passing li- 
brary calls which access the lower level UNIX calls are 
provided for the application. The blocking calls are 
guaranteed not to return until all requested data has 
been received, providing synchronous communication. 
Non-blocking calls return immediately, even if the data 
is not yet available, allowing asynchronous communi- 
cation, and indeterminate calls are either blocking or 
non-blocking depending on the initialization of the dis- 
tributed system. While true that blocking and non- 
blocking calls would be sufficient for both modes of 
communication, the third indeterminate mode is pro- 
vided since its low-level implementation requires less 
overhead than the other routines. 

Fig. 1 illustrates a typical distributed system. That 
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Fig. 1 Computer Model 

system is comprised of three computing elements la- 
beled CE 1, CE 2, and CE 3. The writer processes 
are separate processes connected to the communication 
links of the computing elements. They are connected to 
the computing elements via a pipe and are there to fa- 

* Here the term block indicates a communication wait slate 
and is not to be contused with the term when used in the context 
of multiple t/oct struct ured-grids 
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Fig. 2 Writer Module 


cilitate non-blocking communication and to avoid data 
size restrictions of UNIX sockets. They are used only 
if the low-level communications buffers are too small 
for the amount of data to be sent, thereby avoiding a 
potential deadlock situation. The writer process con- 
sists of an expandable buffer, as shown in Fig. 2, which 
can grow to the memory limitations of the machine. 
That process is spawned by the distributed system if 
needed, and simply reads and writes data to and from 
its buffer when signaled. This allows a large amount of 
data to be transferred without first segmenting it into 
smaller chunks. A graphics filter could also be added 
to the distributed system as an additional computing 
element for interactive visualization as shown in Fig. 1. 

3 Autonomous Multi-Block Al- 
gorithm 

Using the distributed system described above, an 
Euler/Navier-Stokes solver is modified to work in a dis- 
tributed environment. The distributed system allows 
for the concurrent computation of multiple physical do- 
mains. Each physical domain, which is referred to as 
a block, can be mapped onto a computational domain 
with its own coordinate system. With the exception of 
the transfer of information between blocks which share 
a common interface boundary, the computation in each 
block is designed to be autonomous in that its execu- 
tion is independent of the execution of other blocks; 
when implemented in a computer code, each block re- 
quires a separate instance of the flow solver code. 

To accelerate convergence, three multigrid strate- 
gies - synchronous horizontal , asynchronous , and syn- 
chronous vertical - are tested. In the synchronous 
horizontal mode, interface information is updated on 
each level of the multigrid cycle, maintaining synchro- 
nization among the multiple, blocks at each level. The 
asynchronous mode allows the coarse-grid calculations 
within each block to continue even if interface data is 
not available; however, synchronization is imposed on 
th«*finest multigrid level. Finally, in the synchronous 
vertical mode, interface information for all multigrid 
levels is updated only after the finest multigrid level is 


reached. Parallel speedups are compared for the three 
modes. 

3.1 Domain Decomposition Method 

The distributed version of the algorithm uses multiple 
layers of ghost cells to transfer the solution between 
blocks with common interface boundaries. The trans- 
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Fig. 3 Block Decomposition of Computa- 
tional Domain 

fer of information is designed so that the explicit resid- 
ual calculation for the multi-block scheme is identical 
to that of the single block scheme, and therefore con- 
verged solutions for both schemes must be the same. 
To ensure numerical stability, an adaptive blend of sec- 
ond and fourth differences of the solution variables is 
added to the scheme. The fact that the fourth differ- 
ence numerical dissipation has a five point central dif- 
ference stencil dictates that there be at least two layers 
of ghost cells. An additional layer is needed because oP 
a three point pressure switch in the dissipation coeffi- 
cients, for a total of three layers of ghost cells at an in- 
terface boundary. Fig. 3 shows a typical decomposition 
into three blocks of the computational domain of a C- 
grid , such as that in Fig. 4. Block 2, including its ghost 
cells, is highlighted in the figure. The ghost cells extend 
into the interior of the neighboring blocks, blocks 1 and 
3. In the multi-block algorithm, in addition to interior 
cells, three types of boundary cells can be identified: 
physical boundary cells, interface boundary cells, and 
corner boundary cells. Physical boundary cells consist 
of only one layer of ghost cells, while interface bound- 
aries require three layers for the form of dissipation 
employed, as shown in Fig. 5. Also shown in that fig- 
ure are the necessary mesh coordinate points. Mesh 
points in the interface cells are used for the calculation 
of volume and metrics shown in Fig. 6, which are used 
in the calculation of the dissipation in those celts. Not 
having interface coordinate points would require either 
some approximation be made for the interface cell dis- 


3 





sipation, or that the values of the dissipation be trans- 
ferred from the neigltboring block, requiring additional 
communication costs. The values of the interface coor- 
dinate points are determined during the initialization 
phase of the distributed system when the values are 
transferred from the interior of the neighboring block 
to which they correspond. 



Fig. 4 Block Decomposition of Physical Do- 
main (C-Grid) 




Mesh Coordinate Point 
Interface Coordinate Point 

Physical Boundary Cell 
Interface Boundary Cell 
Corner Boundary 
Interior Cell 


interface boundaries. At an interface boundary, solu- 
tion information is transferred from the neighboring 
block. Unlike a shared memory system in which the 
transfer of information involves only a direct memory 
fetch, a distributed message passing system such as the 
one used here requires, in general, that data be sent 
over a communications network. Since networks are 
typically the root of the most severe performance limi- 
tations in parallel computing, it is important that com- 
munications costs be minimized. Therefore, only the 



Fig. 6 Interface Cell Storage 

solution information in the neighboring block which is 
necessary to fill the corresponding interface boundary 
ghost cells is transferred. It would not be prudent, in 
terms of communications costs, to transfer the entire 
solution of a neighboring block. 

Implicit boundaries on block interfaces are treated 
in a manner consistent with characteristic theory. No 
inter-block communication is required for such a treat- 
ment. While it would be possible to construct implicit 
interface boundaries so as to ensure an exact correspon- 
dence with a single block scheme, further communica- 
tions and synchronization costs would be incurred. The 
approximation for implicit boundaries cannot affect a 
converged solution; it can alter only the intermediate 
stages in the convergence process. 


Fig. 5 Cell Classification 


3.2 Application Model 


All communication costs, except for those in the ini- 
tialization phase, occur in the explicit treatment of the 


The multi-block algorithm is implemented on a 
distributed-memory message passing system. Each 
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block may be thought of as an autonomous unit 
linked to neighboring blocks by its interface bound- 
aries though some communications channels, such as 
the example in Fig. 7. The details of the underlying 
communications systems remain hidden from the ap- 
plication. Implemented with the message passing rou- 
tines described earlier, the underlying structure of each 
block module consists of a virtual computing element 
and the optional writers, as illustrated in Fig. 8. Mul- 
tiple block modules, when linked together, form the 
distributed system shown earlier in Fig. 1. 



Communications 

Channel 


Fig. 7 Application Model 



Fig. 8 Block Module 


3.3 Multigrid 

The multigrid algorithm within each block is the same 
as that of the single block algorithm with the additional 
complication of interface boundary treatment on coarse 
grid levels. The three strategies presented are similar 
to those described by Yadlin and Caughey '*■ ®. 



0 Update Boundaries 

V Restrict Solution and Residual 

A Prolong Corrections 

Write Interface Data 

Read Interface Data 
(block until available) 

Fig. 9 Synchronous Horizontal Mode, 5 Level 
Sawtooth Cycle 

In the synchronous horizontal mode, interface 
boundary data is written and read at specified loca--' 
tions within the multigrid cycle on all multigrid levels, 
as illustrated with the sawtooth cycle in Fig. 9. The 
read call will wait until all interface data from neigh- 
boring blocks has been received. Thus, the multigrid 
cycle across all multiple blocks will be synchronized at 
each multigrid level as shown in Fig. 10. 

In the synchronous vertical mode, interface data is 
written to neighboring blocks on all coarse levels, but 
is not read until the finest multigrid level is reached. 
This has the advantage that the calculation within each 
block can proceed while the interface data is trans- 
ferred through the network until the finest multigrid 
level is reached. The communication for the syn- 
chronous vertical mode is illustrated in Fig. 11. All syn- 
chronization for this mode occurs at the finest multi- 
grid level. Since the communication pattern is fixed, 
in that all writes and reads occur at a predetermined 
times within the multigrid cycle, this mode is labeled 
synchronous. 

In the asynchronous mode as illustrated by Fig. 12, 
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Fig. 10 Horizontal Mode Connectivity, 5 Level Sawtooth Cycle 



0 Advance Solution 

f~| Update Boundaries 

V Restrict Solution and Residual 

A Prolong Corrections 

Write Interface Data 

Read Interface Data 
(block until available) 

Fig. 11 Synchronous Vertical Mode, 5 Level 
Sawtooth Cycle 


interface data is written at predetermined locations 
within the multigrid level, but interface data is read 
only after it traverses the network and arrives at the 
neighboring block. A read call will not block until the 
finest multigrid level is reached. Data which arrives 
after it has been requested will be queued until it is 
read during the next read request. If the network is 
extremely slow relative to the speed of the computa- 
tion, this mode behaves like the synchronous vertical 
mo4s. If the network is fast relative to the computa- 
tion, then it behaves more nearly as the synchronous 
horizontal mode. 



f~| Update Boundaries 
V Restrict Solution and Residual 
A Prolong Corrections 
[£> Write Interlace Data 

<3 Read Interface Data J 

(block until available) 

<3 Read Interface Data 

(proceed even if not available) 

Fig. 12 Asynchronous Mode, 5 Level Saw- 
tooth Cycle 

4 Computations 

To demonstrate the efficiency and viability of the dis- 
tributed system, the algorithm described has been im- 
plemented in a computer code for solving the Euler 
and Navier-Stokes equations and several comparisons 
between the multi-block and single block solver have 
been carried out. All test cases presented have been 
carried out on a two-dimensional N ACA 0012 symmet- 
ric wing section. To compare the properties of the 
different multigrid modes, the code has been imple- 
mented on a distributed system consisting of high speed 


RISC-based (Reduced Instruction Set Chip) worksta- 
tions connected by a fiber-optic network. 



Fig. 13 Convergence History of Single Block 
Case, Moo = 0.800, ct = 1.25 


The first case presented is that of inviscid tran- 
sonic flow at free stream Mach number 0.8 past a two- 
dimensional NACA 0012 symmetric airfoil at a positive 
angle of incidence. The case is computed on a 192 x 32 
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Fig. 14 Convergence History of Multi- 
Block Case, Synchronous Horizontal Mode, 
Moo = 0.800, a = 1.25 

cell “C”-grid, and for the multi-block case is broken 
inter six blocks of size 32 x 32 cells each. The distri- 
bution of the blocks is such that four of the blocks lie 



Fig. 15 Convergence History of Multi- 
Block Case, Synchronous Vertical Mode, 
Moo = 0.800, a = 1.25 


on the airfoil surface while the remaining two lie in the 
wake region, downstream of the trailing edge. Conver- 
gence rates are compared for a single block case and 
both synchronous horizontal and synchronous vertical 
mode multi-block cases, and are shown in Figs. 13 - 15. 
In each case, a full multigrid scheme is used with the 
solution on the coarsest grid initialized to free stream 
values. A four level sawtooth multigrid cycle is used 
in the final grid sequence. The residual error curves 



MX* -OiOD Alptu - 1JLS0 Re -OjOwOO 
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Fig. 16 Surface Pressure Distribution of Sin- 
gle Block Case, Moo = 0.800, a = 1.25 

of the multi-block cases are the maximum value over 
each block of root mean square of the density residual. 
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The plots show that the convergence rates do not differ 
significantly between the single block case and the two 
multi-block cases. In each case, the residual has been 



Fig. 17 Surface Pressure Distribution 
of Multi-Block Case, Synchronous Horizontal 
Mode, Moo = 0.800, a = 1.25 

reduced by approximately 6.5 orders of magnitude in 
200 work units, where one work unit is defined as the 
amount of work required to advance the solution one 
time step on the finest grid level. The solution as 



Fig. 18 Surface Pressure Distribution of 
Multi-Block Case, Synchronous Vertical Mode, 
Moo = 0.800, a = 1.25 

represented by plots of surface pressure distribution in 
Figs. 16 - 18, shows the strength and location of the 
shockT'and the values of the force coefficients, to be the 
same for all three cases. 



Fig. 19 Mach Number Contours of Single 
Block Case, M TO = 0.875, a = 0.0 



Fig. 20 Mach Number Contours of Multi- 
Block Case, Synchronous Horizontal Mode, 
Moo — 0.875, a = 0.0 
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Fig. 21 Mach Number Contours of 
Multi-Block Case, Synchronous Vertical Mode, 
Mqo = 0.875, a = 0.0 

The next case presented is that of inviscid transonic 
flow at free stream Mach number 0.875 past a two- 
dimensional NACA 0012 symmetric airfoil at zero an- 
gle of incidence. The case is computed on the same 
192 x 32 cell “C”-grid as in the previous example, and 
is broken into six blocks of size 32 x 32 cells each for 
the multi-block cases. This case is chosen because of 
the strong shock near the trailing edge of the airfoil 
which crosses interface block boundaries and so tests 
the treatment of the artificial dissipation across the in- 
terface boundaries in the vicinity of strong gradients. 
Since the stencil for the treatment of dissipation across 
an interface boundary is the same as for that in the in- 
terior field, the solution of the multi-block solver should 
be identical to that of the single block solver once it has 
converged to its steady state. Contours of constant 
Mach number are presented in Figs. 19 - 21 for a sin- 
gle block case and multi-block cases using synchronous 
horizontal and synchronous vertical multigrid modes. 

As shown in those figures, even with a strong shock 
crossing the interface boundaries, neither of the multi- 
block solutions differs from the single block solution. 
In addition, the drag coefficients for all three cases are 
the same. 

To demonstrate the versatility of the multi-block 
scheme, a laminar viscous flow is computed at free 
stream Mach number 0.5 past a NACA 0012 airfoil at 
zero angle of attack. The computation is performed on 
a 192x49 cell “C”-grid which is partitioned into twelve 
blocks of size 32 x 24 cells for the multi-block case. The 
blocks are distributed such that four lie along the air- 



Fig. 22 X-Momentum Contours of Single 
Block Navier-Stokes Case, M TO = 0.5, ct = 0 

foil surface, two along the periodic cut downstream of 
the trailing edge, and the remaining six in the fen-field. 
The solution for the single block and multi-block cases 
are shown in Figs. 22 and 23 and are found to be in 
exact agreement. 

A full multigrid scheme is used with the solution on - 
the coarsest grid initialized to free stream values. A 
four level sawtooth cycle is used in the final grid se- 
quence; For the single block case, the density residual 
error is reduced by 4.3 orders of magnitude in 150 work 
units. The convergence rate for the multi-block case is 
nearly identical, differing by only 0.04%. 

Next, the same case is computed solving the Eu- V 
ler equations in the six far-field blocks rather than the 
Navier-Stokes equations. The Navier-Stokes equations 
are solved in the near-field blocks where viscous ef- 
fects are important. The solution shown in Fig. 24 
is found to be in close agreement with the previous 
cases, with the drag coefficient differing by 0.1%, and 
a convergence rate likewise nearly identical to the single 
block case. Using the synchronous horizontal multigrid 
mode, the multi-block calculation was performed on 
twelve identical workstations linked by ethernet, but 
achieved only a 3.5 performance gain over the single 
block calculation. 

To test the parallel performance of the different 
multigrid modes, tests were run on a distributed system 
consisting of five IBM RS/6000-530 RISC workstations 
connected in a ring network by an IBM serial optical 
channel. A speedup is defined as the ratio of the wall 
clock time required to process a solution on a single 
processor to the wall clock time required on multiple 
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Fig. 23 X- Momentum Contours of Multi- 

Block Navier-Stokes Case, M^, = 0.5, a = 0 

processors on an unloaded system. The theoretically 
maximum speedup is the number of processors used 
in parallel for a particular solution. Rarely in prac- 
tice is that ideal reached both because of algorithmic 
overhead in the parallel algorithm and communications 
delays among the processors. 

For this test, a 320 x 64 cell mesh was decomposed 
into five 64 x 64 cell blocks so that each of the five 
processors would have one-fifth of the original mesh 
in the parallel computations. An Euler calculation 
was performed in each block. The theoretical bound 
on the speedup is 5, assuming each processor per- 
forms the same operations, since each block is of equal 
size. The results of several tests are shown in Fig. 25, 
which shows the speedup verses the number of multi- 
grid levels. 

The number of communication calls increases lin- 
early with the number of multigrid levels, and the syn- 
chronous horizontal mode suffers the greatest penalty 
because of this. Recall that in the synchronous hori- 
zontal mode, interface information is updated on each 
level of the multigrid cycle. Thus, on every multigrid 
level, each processor must wait until it receives data 
from neighboring blocks until continuing with the cal- 
culation. With five levels of multigrid, the synchronous 
horizontal mode speedup was only 2.83 or 57% of the 
theoretical maximum 5. This is consistent with the 
performance observed in the Navier-Stokes calculation 
described earlier. The performance of both the asyn- 
chronous mode and the synchronous vertical mode were 
mucbnmproved. The asynchronous mode reached a 
speedup of 4.29 and the synchronous vertical mode 



Fig. 24 X-Momentum Contours of 
Multi-Block Hybrid Euler/Navier-Stokes Case, 
Moo = 0.5, a = 0 


a speedup of 4.59 with five multigrid levels. Much 
of this improvement stems from not waiting for data 
from neighboring blocks on coarse multigrid levels. The 
asynchronous mode continues with its calculation until 
interface data from neighboring blocks arrives and only 
afterwards are coarse level interface boundaries up- 
dated. The synchronous vertical mode behaves in much 
the same way, except it ignores all incoming coarse level 
data until the calculation reaches the finest level in the 
multigrid sequence. Note that without multigrid, the 
modes are algorithmically equivalent to each other, and 
each recorded a speedup of 4.95 or 99% of the maxi- a 
mum possible. 



Fig. 25 IBM Optical Ring Speedups 
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5 Summary 


References 


To allow for the treatment of complicated geome- 
try and to take advantage of parallel and distributed 
computing systems, an Euler/Navier^Stokes solver has 
been extended to allow for computation on a domain 
separated into multiple blocks. In the multi-block 
scheme, a separate instance of the flow solver is used 
to compute the solution in each block. 

Except for the treatment of interface boundary con- 
ditions which are a necessary part of the multi-block 
strategy, the algorithm is identical to the single block 
algorithm. To allow neighboring blocks to commu- 
nicate across their interface boundaries, a library of 
communication utilities has been developed. The al- 
gorithm has been constructed so that a converged flow 
field will be identical to the solution obtained with the 
single block algorithm. This is shown to be the case 
in three examples presented, even when strong gra- 
dients cross interface boundaries. Convergence rates 
between the single and multi-block schemes are nearly 
the same, using both synchronous horizontal and syn- 
chronous vertical multigrid modes in the multi-block 
scheme. 

The multi-block algorithm has the added advantage 
over the single block scheme in that it can readily 
be used in a parallel computing environment. Several 
tests were run on a coarse-grained distributed system 
to compare three multigrid modes - synchronous hor- 
izontal, asynchronous, and synchronous vertical - for 
the update of coarse level boundary conditions. Com- 
munications requirements restricted the synchronous 
horizontal mode speedups. The problems were largely 
resolved by using either the synchronous vertical mode 
or the asynchronous mode, where favorable speedups 
were observed, even when several multigrid levels were 
used. 
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Abstract 

A Multigrid Alternating Direction Implicit scheme has 
been developed to solve the two-dimensional Thin 
Layer Navier-Stokes equations for compressible turbu- 
lent flows, incorporating a simple algebraic turbulence 
model. Spatial discretization of the governing equa^- 
tions is done using a finite volume approximation to 
provide flexibility in dealing with complicated geome- 
tries. In order to maintain stability and robustness, 
artificial dissipation is added in the form of an adaptive 
blend of second and fourth differences of the solution. 
The time-linearized implicit operator is approximated 
as the product of two one-dimensional factors, each of 
which is diagonalized using a local similarity transfor- 
mation for computational efficiency. The viscous terms 
are treated explicitly to maintain the diagonal form. 
The implicit scheme is used within the framework of 
the multigrid method to further accelerate convergence 
to a steady state. Results are presented for transonic 
flows past airfoils. Flow field results, including bound- 
ary layer quantities, are presented and compared with 
other computational data and experiments to confirm 
the accuracy of the method. Comparisons of conver- 
gence rates are made to demonstrate the efficiency of 
the implicit ADI multigrid method. 


1 Introduction 

Much progress has been made in recent years in de- 
veloping efficient algorithms to solve the Navier-Stokes 
equations for complex geometries. However, if they are 
to be used on a regular basis for design purposes in the 
aircraft industry, the algorithms have to be made more 
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Copyright ©1991 by the American Institute of Aeronautics 
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efficient and accurate. This is particularly important 
for high Reynolds number flows, where the boundary 
layers are very thin and where shocks may cause signifi- 
cant boundary layer separation. It is generally accepted 
that the biggest stumbling block to obtaining physically 
correct solutions for such flows is the absence of good 
turbulence models [1]. The incorporation of better tur- 
bulence models leads to an increase in computing power 
requirements. So alongside the drive to develop faster 
computers is the drive to develop more efficient algo- 
rithms. 

The Multigrid Diagonal Implicit (MDI) algorithm of 
Caughey [2] has proved very efficient in solving the Euler 
equations. The implicit nature of the algorithm makes 
it particularly attractive for solving the Navier-Stokes 
equations for high Reynolds number flows where the 
large cell aspect ratios, required to resolve the extremely 
thin boundary layers, make stability a problem with 
explicit algorithms. The diagonalization of the equa- 
tions and the implementation of the algorithm within 
the framework of multigrid make the scheme very effi- 
cient for the Euler equations. This scheme is extended 
here to solve the Thin Layer Navier-Stokes equations for 
turbulent transonic flows over airfoils. First, the govern- 
ing equations and the numerical scheme are described, 
and then results including boundary layer quantities are 
presented for a variety of flows. 


2 Governing Equations 

2.1 Navier-Stokes Equations 

The Navier-Stokes equations describe the motion of a 
viscous fluid. In Cartesian coordinates, the fully con- 
servative form of the unsteady two-dimensional Navier- 
Stokes equations for a compressible fluid, neglecting 
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body forces and heat sources, is 

dw df dg df v dg v 

dt dx dy dx dy ’ 

where 

w = {p, pu, pv, e} T 
is the vector of conserved variables, 

/ = {pv, pu 2 +P, puv, (e+p)u} T , 

9 = {pv, puv , pv 2 +p, (e+p)v} T , 

are the inviscid flux vectors in the x and y directions 
respectively, and 

ft! — {0, <T rr , <?xy , <+>x } , 

9v = {0, &xy, Vyy, 4>y } . (4) 

are the viscous flux vectors in the x and y directions, 

respectively. The variables p and p are the fluid density 

and pressure, u and v are the velocity components in the 
x and y directions, and e is the total energy per unit 
volume. The equation of state for a caloricaliy perfect 
gas is used to relate the pressure to the total energy 

P = (7-l)|e~p U 2 ~ }’ ( 5 ) 

where 7 is the ratio of specific heats. For air, 7 = 1.4 . 

For isotropic Newtonian fluids the viscous stresses are 
given by 


( 1 ) 

( 2 ) 

(3) 


Vxx 

<T x y 

Vyy 


„ du du dv 
2 ^3x + A 3x + 3y ’ 
du dv 

_ dv du dv. 


( 6 ) 


Stokes’ hypothesis, 


A = 


-!p> 


(7) 


relates the factor A to the molecular viscosity p of the 
fluid. The energy fluxes, <j> x and <j> y , are given by 


— VCTxx ~h VtTxy 9x, 

4*y — UCTxy "b V(Tyy — <Jy . 

where 



,ar 

9i = 



, dT 

9y = 

dy 


( 8 ) 


(9) 


are the heat fluxes in the x and y directions. Here T 
is the temperature and k is the thermal conductivity of 
the fluid. 

The dependent variables in the governing equations 
(Eqs. 1) are instantaneous local quantities. For turbu- 
lent flows, the presence of fluctuations about the mean 
makes it prohibitively expensive to compute the instan- 
taneous values of the flow variables at large Reynolds 
numbers since the range of length and time scales which 
must be resolved is very large. It is more practi- 
cal to compute, instead, the mean field alone if suit- 
able closure models can be developed. Time-averaging 
the flow equations and writing them in terms of mass- 
averaged dependent variables lead to the Reynolds Av- 
eraged Navier-Stokes Equations for compressible turbu- 
lent flows [3]. These equations have the same form as 
the equations for instantaneous quantities (Eqs. 1) ex- 
cept that 

1. the viscous stresses (Eqs. 6 ) are augmented by the 
Reynolds stresses and 

2. the heat fluxes (Eqs. 9) are augmented by the anal- 
ogous heat fluxes due to the turbulence. 

The turbulence model which is used to estimate these 
additional quantities is described in the next section. 


2.2 Turbulence Model 

The effects of turbulence are modeled using the eddy 
diffusivity concept for the Reynolds stresses and eddy 
thermal conductivity for the turbulent heat fluxes. The 
total diffusivities are given by 

p = Pmol + Pt, 

k = k mo i + kt, (10) 

where p mo i and k mo i are the molecular quantities, and 
Pt and k t are the turbulent quantities. We obtain clo- 
sure by modeling p t analytically using a zero equa- 
tion model and calculating k t from Pr t , the turbulent 
Prandtl number, which is chosen to be equal to 0.9 . 

The turbulence model is based on the algebraic model 
of Baldwin and Lomax [4]. This is a two-layer zero- 
equation eddy-viscosity model. The eddy viscosity p t is 
given by 

j {Pt)inncr V ^ Vcrossover \ 

\ {Pt) outer y A y crossover 

where y is the distance normal to the wall and y C rotsovcr 
is the smallest value of y at which the value of p t cal- 


culated from the inner formula exceeds the value from 
the outer formula. 

This model does not provide any rigorous means of as- 
certaining the location of transition. However, in this 
implementation of the model it is possible to specify the 
transition location and modify the values of p t accord- 
ing to 


Pt 


{ 


0 

(.Pt) Baldwin — Lomax 


X ^ X transition 
X ^ ^transition 


(12) 


The transition location x tr ansition can either be speci- 
fied directly or deduced on the basis of a specified cri- 
terion. 


The model is modified slightly when applied to the 
wake. In the wake the van Driest damping factor is 
set equal to 1, and the y-distance is measured from the 
first coordinate line in the wrap-around direction of the 
C-grid, i.e. from the tj = 0 line. 


/ ° \ 
Zx<7xx+Zy<7xy 
ZxVxy+ZyGyy 
\ Zxtx+S.y'Py / 

° ) 

Vx &xx “h ‘Hy &xy I 

r lx<7xy + l}yO'yy I 

\ > 7 x4>X + T]y<t>y / 


( 17 ) 


(18) 


are the transformed flux vectors. The contravariant 
components of the velocity are related to the Cartesian 
velocities by 


(v) = ' , ' , (“)' (19) 

where J is the Jacobian matrix of the transformation 
written as 

J = { Xf \ . (20) 

l % yr, J 


2.3 Coordinate Transformation 


The determinant k of the Jacobian J, which corresponds 
to the cell area, is given by 


To facilitate the handling of complex geometries in a 
finite volume formulation, the Navier-Stokes equations 
(Eqs. 1) are transformed from the Cartesian coordinate 
system or the physical plane into a generalized curvilin- 
ear coordinate system or the computational plane. This 
non-singular transformation has the following form: 


£ = £(*,!/); *7 = ri(x,y). (13) 


Under this transformation, an arbitrary quadrilateral 
cell in the physical plane becomes a unit cell in the com- 
putational plane. The transformed system of equations 
written in strong conservation form is 


d\V d£ dGv_ 

dt + d£ + dp d£ + drj 

where 

W = hw 


(14) 


is the transformed vector of dependent variables and 
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(16) 


h = x ( y n - y ( Xr,. (21) 


2.4 Thin-Layer Approximation 

The Thin Layer approximation [4] is a simplification 
to the Navier-Stokes equations. Under this approxima- 
tion, all diffusion processes in the streamwise direction 
are neglected. If the Reynolds number of the flow is 
large and if there is a dominant primary flow direction - 
i.e., if there are no regions of significant separation, then 
the diffusion in the streamwise direction is much smaller j 
than that in the normal direction. This forms the phys- 
ical basis for the approximation. In this respect it is 
similar to the Boundary Layer approximation. But it is 
more general in that no assumptions are made regarding 
the pressure, and the momentum equation normal to the 
body surface is retained. The proper implementation of 
the Thin Layer approximation in a numerical procedure 
requires a grid with a body- or stream-aligned coordi- 
nate. Under these conditions, the approximation can be 
made without adversely affecting the solution. In the 
present formulation the £ -coordinate is approximately 
parallel to, and the r \ -coordinate is approximately nor- 
mal to the airfoil surface. Therefore all £ -derivatives 
are neglected while all 77 -derivatives are retained in the 
viscous terms and in the evaluation of the Cartesian 
derivatives in the viscous terms. In particular, the vis- 
cous flux in the 77 -direction G v which involves the calcu- 
lation of Cartesian derivatives (Eqs. 6 and 9) is modified 
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to a simpler form G' v containing only 77 -derivatives: 
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G' v = h 


where 
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The system of equations then reduces to 
dW d£ dG _ dG' v 

dt + d£ + dr/ dr/ 


( 22 ) 


(23) 


(24) 


3 Numerical Method 


The Thin-Layer Approximation to the Navier-Stokes 
equations (Eqs. 24), along with boundary conditions 
which will be discussed later, is solved numerically. The 
equations are discretized in space using finite volumes 
and the solution is advanced in time using a diagonal 
Alternating Direction Implicit scheme. 

In a finite volume formulation, the physical domain is di- 
vided into a number of quadrilateral cells. The govern- 
ing differential equations (Eqs. 24) are integrated over 
the area of a cell and ate reduced to the following form 
using Green’s Theorem: 

= - / G'v d£ (25) 

JdA 

Here the integration is performed in the computational 
domain and hence A is the area of the cell in the com- 
putational plane and dA is the cell boundary. Cell av- 
eraged quantities are used for the dependent variables. 

This finite volume scheme applied to the Euler equa- 
tions does not contain any dissipative terms. In order to 


prevent odd-even point decoupling and oscillations near 
shock waves or stagnation points artificial dissipation 
terms must be added when solving the Euler equations. 
The Navier-Stokes equations on the other hand possess 
dissipative properties due to the presence of the viscous 
terms. However the physical dissipation provided by 
these terms in regions away from the shear layer may 
not be sufficient to guarantee stability. So in order to 
maintain the stability and robustness of the numerical 
procedure it is still necessary to add artificial dissipa- 
tion. The terms are constructed as an adaptive blend of 
second and fourth differences with the directional scal- 
ing of the terms suggested by Caughey [2], 

The modified set of equations in integral form is 

7,JJ a Wd < d ” + IJ ?*>-<&) 

= -/ G'v d£ + / (DfiZ dr/ — D n ui d£) (26) 

JdA JdA 

and in differential form is 


+ + dD + dD ** (07) 

dt + dr/ dr/ + d£ dr, 1 


where D^w and are the dissipative fluxes across 

cell faces in the £ and r/ directions. The differential 
operators Df and have the form 


D ( 

Dr, 


,(2) 




_ _ 

d£ de ’ 

eW— -(W — . 
n dr, n dr, 3 


(28) 


The coefficients of dissipation, and are defined 
following Caughey [2]. 

In computations of viscous flows, and turbulent flows in 
particular, it is important to monitor the effects of artifi- 
cial dissipation on the accuracy of the solution. Means 
of estimating the integrated effect of these terms are 
currently being studied. 


3.1 Iterative Scheme 

To construct the iterative scheme, the governing equa- 
tions are written in their differential form (Eq. 27). The 
spatial derivatives are approximated implicitly and the 
changes in the flux vectors are linearized in time. The 
implicit Operator thus obtained is approximated as the 
product of two one-dimensional factors. The develop- 
ment thus far follows that of Briley and McDonald [5] 
and Beam and Warming [6]. Following Pulliam and 
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Chaussee [7] the two implicit factors are then diagonal- 
ized using local similarity transformations. 

All spatial derivatives except the viscous term are ap- 
proximated implicitly, i.e., as weighted averages of dif- 
ferences taken at the old and new time levels. Treating 
the viscous term implicitly presents problems because 
simultaneous diagonalization of the Euler and viscous 
flux Jacobians is not possible. For this reason the vis- 
cous term is kept purely explicit in this work, although 
recent work by Tysinger and Caughey for laminar flows 
[8] shows that including approximations to the contribu- 
tions of the viscous terms in the implicit factor improves 
the stability of the scheme. The artificial dissipation 
terms must be treated implicitly for good convergence 
and this can be done easily without destroying the di- 
agonal form. Using a forward difference to approximate 
the time derivative, the equations can be written as 


are the Jacobians of the flux vectors with respect to 
the solution. Introducing these linearizations into Equa- 
tion 29 yields a scheme of the form 

(/ + 0At[6 ( A?j + S.BPj - W.ofe) 

= R?j- (33) 


The operator on the left hand side of the above equation 
(Eq. 33) is approximated as the product of two one- 
dimensional factors to give 


(; + 8 At [m?j - M>?,,te)]) 


x -f- 9 At 
= Ki- 


6 n B?j ~ 6 n V. 




A WPj 


(34) 


AWPj + 9 At - F?j) + 6, (<5?+ l - G?j) 

= -A t[6 ( F?j + 6,01) ~ Sn&v tJ 

= (29) 


where AWPj is the correction vector defined as 

AWPj = Wpf 1 - WPj, 

and the parameter 9 determines the degree of implicit- 
ness in the scheme (0 < 9 < 1). The difference operators 
and V, have the form 


v ( = 4 2) * e -4 4) 6f, 

V, = (30) 


and R’jj is the residual vector corresponding to the 
cell (i,j) at time level n. 

The changes in the flux-vectors are linearized in time 
using a Taylor series expansion about time level n, 

Fpf 1 = F?j + A?jAWPj + O (At-) , (31) 

and 

Gif 1 = Gfj + B?jAW?j + O (At 2 ) , (32) 

where 



The error is making this approximation is O (At 2 ). The 
implicit treatment of the fourth order dissipation terms 
leads to the requirement that block pentadiagonal sys- 
tems be solved in each direction. For 2-D problems, each 
of the blocks is 4 x 4. Pulliam and Chaussee [7] showed 
that each of these systems of equations can be diagonal- 
ized, thereby greatly reducing the computational labor 
required to solve them. 

The approximately factored equations (Eqs. 34) are di- 
agonalized at each mesh point by a similarity transfor- 
mation. The modal matrices Qa and Qb diagonalize 
the Jacobian matrices A and B as follows 

Qa'AQa = Aa ; Q- b 1 BQ b = A*.' J (35) 

Here A* and A# are diagonal matrices whose diagonal 
elements are the eigenvalues of A and B respectively. 
The elements of the Jacobian matrices, their modal ma- 
trices and their eigenvalues are given by Pulliam and 
Chaussee [7]. 

Substituting Equation 35 into Equation 34 yields the 
decoupled set of equations 

{I + 9 At [A a .J< - 4^f(l/M 
+ 4 4) / ( 4 ( i // li , j )]}"g-i j 

X Q b?i {I + 0At[A Bi J„ - «gi|(l /hij) 

+ 4M(1 /hij)]) n AVfi 

= —At Qjl {9(Fi j + 6,Gij — 6,G' Vi j 

- 6(V{Wij - 6 v V v Wij) n . (36) 

This represents a set of four scalar pentadiagonal sys- 
tems. The error associated with the diagonalization can 
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be shown to be of O(At). The systems of equations 
(Eq. 36) are solved at each time step by first solving for 
intermediate corrections along lines of constant tj ( i.e 
constant j) and then solving along lines of constant £ 
(i.e., constant t) for the corrections A\/" to the charac- 
teristic variables of the linearized one-dimensional prob- 
lem in the rj-direction. The correction A W7*- to the 
solution in each computational cell is then given by 

AH/"- = Q b A !/■"■ . (37) 

3.2 Convergence Acceleration 

The convergence to a steady state of the solution to the 
difference equations (Eqs. 36) is accelerated using two 
techniques - multigrid and local time-stepping. 


Residuals are computed using the corrected values for 
the flow variables and a specified number of iterations 
is performed. Corrections obtained are interpolated to 
the next finer grid and this process is continued until 
the finest grid is reached. The cycle is repeated for a 
prescribed number of Work Units. A Work Unit is de- 
fined as the amount of computation required for one 
smoothing step on the finest mesh. 

Both the body-surface and the far-field boundary con- 
ditions are updated on coarser meshes. In our calcu- 
lations, a fixed V-cycle is used in which the solution is 
advanced one time step (i.e., one iteration) on each grid 
level as the grid is coarsened and refined. 


3.2.2 Local Time Stepping 


3.2.1 Multigrid Algorithm 

The diagonal ADI scheme described above has good 
high wave number damping characteristics and is there- 
fore a natural choice as a smoothing algorithm for elim- 
inating the high wave number errors at each level of a 
multigrid procedure. The multigrid strategy employed 
here follows that of Jameson [9] and Caughey [2]. 

A specified number of iterations is first performed on the 
finest grid. A coarser grid is then formed by eliminat- 
ing every second line of the fine grid in each coordinate 
direction. Each cell in the coarser grid therefore con- 
tains four adjoining fine grid cells. Restricted values of 
the flow variables in each coarser grid cell are obtained 
using area- weighted averages of the values in the four 
corresponding fine grid cells. Restricted values for the 
residuals in each coarser grid cell are obtained by sum- 
ming the residuals in the corresponding fine grid cells. 
Forcing functions are defined for each coarser grid cell as 
the difference between the restricted residuals and the 
values of residuals calculated using the restricted flow 
variables. The residuals used to drive the corrections on 
the coarser grid are then defined as sums of the residuals 
computed on the coarser grid using the current values of 
the flow variables and the forcing functions. The forc- 
ing functions ensure that the corrections on the coarser 
grid are driven essentially by the residuals computed on 
the fine grid. Corrections are computed on the coarser 
grid for a specified number of iterations. Then a still 
coarser grid is formed by the above procedure, and the 
process repeated until the prescribed coarsest grid is 
reached. Corrections are computed on the coarsest grid 
and they are prolonged back to the next finer grid using 
bilinear interpolation in the computational coordinates. 


The idea behind local time stepping is to take the largest 
possible time step in each individual cell. This destroys 
the time accuracy of the solution, but it does not affect 
the steady state solution. The size of the time step 
used in each cell depends on the dimensions of the cell. 
Directional time steps, based on the one-dimensional 
wave propagation in inviscid flow and on unit Courant 
number, are defined as: 


A = 

\U\ + a^J + Cy 

(38) 

A tri 

(39) 

\ v \ + a \/nZ + »7y 


The time step At corresponding to each 
defined as: 

cell is then 

At = CFL ( ) 

V Atf + A t v J 

(40) 

where CFL is the Courant number, which is 
the same for all cells. 

taken to be 


3.3 Boundary Conditions 

Boundary conditions are imposed numerically using a 
single layer of dummy cells along the entire boundary. 
Values for the dependent variables are assigned to these 
cells using explicit boundary conditions. The implicit 
nature of the algorithm requires that boundary condi- 
tions also be applied on the solution variables before 
solving the system of equations. 
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3.3.1 Explicit Boundary Conditions 


4 


Results 



II 


The solutions computed to date have been for subsonic 
freestream Mach numbers. The upstream boundary 
conditions in the farfield are based on the Riemann 
invariants for the one-dimensional problem normal to 
the boundary. Variables are either extrapolated from 
within the domain or set to freestream values depend- 
ing on the direction of propagation of the characteristic. 
For meshes used in the present work the farfield inflow 
boundary is a constant p line with £ increasing clock- 
wise. At subsonic inflow locations the first Riemann 
invariant 


Ri 


X(V - y ( u 1c 

■fihX T -‘ 


(41) 


is extrapolated from the interior of the domain, while 
the second Riemann invariant 




X(V - y^u 

\/ x \ + yl 


2c 
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(42) 


is specified using the freestream values. The entropy 
and the tangential velocity of the fluid at the boundary 
are set to freestream values. Using these conditions the 
values of the dependent variables are readily obtained. 

At a subsonic outflow boundary pu, pv and the en- 
tropy are extrapolated from the interior, while pressure 
is specified to be the freestream value. The density and 
energy are then calculated using these boundary values. 

On the body surface the no-slip boundary condition is 
applied, i.e., the velocity components u and v are set 
equal to zero. The surface is assumed to be adiabatic, 
i.e. dT/dn = 0. The high Reynolds number approx- 
imation that the normal pressure gradient, dp/dn, on 
the surface is zero is also used. 


The Multigrid Diagonal Implicit (MDI) scheme de- 
scribed in the previous section was coded and tested for 
laminar and turbulent transonic flows past two differ- 
ent airfoils - the NACA 0012 and RAE 2822 sections. 
The test cases studied included attached flow, mildly 
separated flow and flow with massive separation due to 
shock - boundary layer interaction. 

The scheme has been applied to compute transonic 
flows past airfoils when the Reynolds numbers are large 
enough that the boundary layer on the airfoil is turbu- 
lent. The following cases are presented here: 

1. NACA 0012 - M 0 o = 0.7, a = 1.49, Re, = 9 x 10 6 

2. NACA 0012 - Moo = 0.799, a = 2.26, Re c = 9xl0 6 

3. RAE 2822- Moo = 0.725, a = 2.92, Re c = 6.5 x 10 6 

4. RAE 2822 - — 0.75, a = 3.19, Re c = 6.2 x 10 6 

These are test cases used at the Viscous Transonic Air- 
foil Workshop of 1987 [1]. 

The computations for both the NACA 0012 and the 
RAE 2822 airfoils were done on C-grids generated by 
the GRAPE code [10], containing 192 x 48 cells in the 
wrap-around and body-normal directions respectively. 
Of the 192 cells in the wrap-around direction 120 or 
62.5% were on the airfoil surface for the NACA 0012. 
In the case of the RAE 2822 airfoil two-thirds of the 
cells were placed on the airfoil surface. The distance 
from the airfoil to the first coordinate line- was about 
5 x 10 -5 chords which corresponds to a y + less than 
4 for the given Reynolds numbers at most points on 
the airfoil. The farfield boundaries were about 7 chord 
lengths from the airfoil. The cells are highly clustered in 
the 77 -direction near the surface of the airfoil and have 
aspect ratios as large as 10 3 . The Prandtl number was 
chosen to be 1.0, and the turbulent Prandtl number is 
fixed at 0.9 for all computations. 




3.3.2 Implicit Boundary Conditions 


The implicit treatment of the far-field boundary condi- 
tions is straightforward; they are treated in a manner 
consistent with characteristic theory. These conditions 
are the same as those used in solving the Euler equations 
[2]. For Navier-Stokes calculations, the use of character- 
istic theory to determine the implicit boundary condi- 
tions on the body surface is inappropriate. However, ho- 
mogeneous Dirichlet conditions are found to work well 
for a wide range of problems. 


4.1 Flow over NACA 0012 Airfoil 

The NACA 0012 airfoil is a symmetric 12% thick airfoil 
which has been tested extensively both experimentally 
and computationally. The experiments used for com- 
parison here were conducted in the transonic tunnel at 
the NASA Langley Research Center by Harris [11]. In 
these experiments, the boundary layer was tripped us- 
ing carborundum strips at 5% of the chord. In all the 


493 


computations, the location of transition, x, ran „t, on , was 
fixed at 0.05 chords. Airfoil surface pressure distribu- 
tions for two cases are compared with the experiments 
performed by'Harris and the computational results from 
the VTA Workshop [1]. 


4.1.1 Case 1: NACA 0012 Airfoil at 

■Moo = 0.7, a = 1.49°, and Re c = 9 x 10 6 

For this case the flow is attached and just slightly super- 
sonic near the leading edge on the upper surface. The 
measured experimental angle of attack for this case was 
1.86°. This was corrected to 1.49° by Harris using a lin- 
ear method of accounting for wind tunnel wall effects. 

Figure 1 shows that the computed surface pressures are 
in excellent agreement with the experimental data. Ta- 
ble 1 gives a comparison of the force coefficients. The 
lift coefficient obtained using the present method is 
about 2% less than the experimental value and is within 
the range of values obtained computationally by others. 
Drag coefficients are difficult to calculate accurately be- 
cause the pressure integration for drag is very sensitive. 
Even so the value obtained 0.0084 is only about 6% 
larger than the experimental value, and is within the 
range of the VTA Workshop values. Of the total com- 
puted drag about 17% is due to skin friction and the 
rest is due to pressure drag. 

Convergence results are presented in Figures 2 and 3. 
Grid sequencing is used, i.e. , multigrid solutions are first 
obtained on coarser grids and then interpolated for use 
as initial conditions on finer grids. The error is defined 
as the absolute value of the residual of the continuity 
equation averaged over all the grid cells. The logarithm 
of the error is plotted versus the number of Work Units 
in Figure 2 for a single grid and for 4 levels of multi- 
grid. We see that with 4 levels of multigrid the error has 
been reduced 8 orders of magnitude in 500 Work Units, 
whereas for the single grid it has been reduced only 3 or- 
ders of magnitude. The asymptotic rate of convergence 
is clearly much improved with multigrid. The Courant 
number for both cases was 24, and local time-stepping 
w r as used. Figure 2 also compares the convergence his- 
tory of the MDI scheme presented in this paper with 
the explicit Runge-Kutta multigrid scheme of Martinelli 
and Jameson [12], The error as defined earlier is plotted 
as a function of Work Units. The results of Martinelli 
and Jameson have been converted to Work Units for 
this comparison; this is appropriate since one work Unit 
of our implicit scheme requires almost exactly the same 
amount of CPU time as a Work Unit of the explicit mul- 
tistage scheme of Martinelli and Jameson. The overall 


convergence rate and the asymptotic rate are improved 
with the present implicit scheme. Figure 3 shows that 
the three measures of global convergence - the lift co- 
efficient Cl, the drag coefficient Co, and the number 
of cells N }up in which the local velocity is supersonic, 
have converged to within plottable accuracy of their fi- 
nal values in less than 50 work units when using 4 levels 
of multigrid. 


4.1.2 Case 2: NACA 0012 Airfoil at 

Moo = 0.799,a = 2.26°, and Re c = 9 x 10 6 

Transition was again fixed at 0.05 chord. The flow 
field for this case contains a shock on the airfoil up- 
per surface at an x/c of about 0.5. The shock is strong 
enough to induce a significant boundary layer separa- 
tion. The experimental data obtained by Harris [11] 
are compared with the computational results in Fig- 
ure 4. The computational angle of attack (2.26°) is ob- 
tained from the measured angle of attack (2.86°) using 
a linear wind tunnel wall correction procedure [11]. Our 
results are generally in close agreement with other com- 
putational results that use the same turbulence model 
(illustrated here using the results of King [13]), but the 
shock strength and the shock position are incorrectly 
predicted relative to the experiment. This is also re- 
flected in the comparison of computed measured force 
coefficients shown in Table 1. The computed shock is 
both stronger and farther downstream than that mea- 
sured experimentally. The convergence history using 
grid sequencing and four levels of multigrid is shown in 
Figure 5. The rate is comparable to that obtained for 
the simpler Case 1. 


4.2 Flow over RAE 2822 Airfoil 

The experimental results of Harris on the NACA 0012 
contained no boundary layer or wake measurements. 
Hence, although the two test cases above showed that 
the MDI scheme was efficient and that pressure distri- 
butions could be reasonably well predicted, they did 
not provide any evidence on how well the turbulence 
model and its implementation within this scheme were 
able to predict flow properties of the airfoil boundary 
layer. To provide this information, comparisons were 
made between the computations using the MDI scheme, 
and experiments done on an RAE 2822 airfoil by Cook, 
McDonald and Firmin [14] at the Royal Airforce Estab- 
lishment in the U.K. This is a supercritical airfoil with a 
highly cambered aft portion. Measurements reported by 
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Cook et al. on this airfoil include surface pressure mea- 
surements, wake total and static pressures and bound- 
ary layer total and static pressures. From the data they 
were able to calculate the skin friction coefficient Cf, 
and the displacement and momentum thicknesses 6 and 
S' at various points on the surface. Transition to tur- 
bulence is effected using a strip of tiny glass spheres at 
3% of the chord. 


4.2.1 Case 3: RAE 2822 Airfoil at 

Mco = 0.725, a = 2.92°, and Re c = 6.5 x 10 6 

The airfoil surface pressure distribution is shown in Fig- 
ure 6. Clearly a correction has to be made to account 
for the wind-tunnel-wall interference. This is done by 
modifying the angle of attack to match the computed 
lift coefficient with the experimental value. At this cor- 
rected angle of attack the drag coefficient is also close 
to the measured value (Table 1). The surface pressure 
distribution at the corrected angle of attack of 2.5° is 
shown in Figure 7. The computed solution agrees well 
with experiment except near the shock. 

The measured and computed skin-friction coefficient 
distributions are shown in Figure 8. The oscillation near 
the leading edge is due to transition which was fixed 
at 0.03 chords. The computed results show no shock- 
induced separation and are in good agreement with the 
experimental values. This is remarkable considering the 
fact that it is a derivative quantity and is therefore more 
difficult to estimate accurately. 

The displacement thickness and the momentum thick- 
ness distributions on the upper surface of the airfoil are 
shown in Figure 9. The agreement with experiment is 
very good before the shock but is only fair after it. This 
is also reflected in the velocity profiles calculated at two 
chordwise locations on the airfoil upper surface. In Fig- 
ure 10, the computed velocity profile at x/c = 0.319 is 
compared with experimental data taken at two spanwise 
locations. In the same figure, the computed and experi- 
mentally obtained velocity profiles at x/c = .95 are also 
compared. They show that the boundary layer profile 
shape at a location before the shock ( x/c = 0.319) is 
reasonably well predicted while the profile shape at a 
location after the shock (x/c = .95) is rather poorly 
predicted. 

A plot of the convergence history is shown in Figure 11. 
Using four levels of multigrid and grid sequencing, the 
solution converged to a steady state in about 50 work 
units. The average residual has been reduced about 7 
orders of magnitude in 500 work units. 


4.2.2 Case 4: RAE 2822 Airfoil at 

Moo = 0.75, a = 3.19°, and Re c = 6.2 x 10 6 

This case is widely regarded as one of the most diffi- 
cult cases because the shock wave causes a significant 
amount of separation. The angle of attack was modified 
to 2.50® to match the lift coefficient. At this corrected 
angle of attack the drag coefficient is again close to the 
measured value (Table 1). The computed surface pres- 
sure distribution is plotted along with the experimental 
data in Figure 12. The computed shock position is too 
far downstream and the shock is too strong. But the so- 
lution is in general agreement with other Navier-Stokes 
solvers (Table 1) which may indicate that the problem is 
not as much numerical error as due to shortcomings in 
the turbulence model [1]. The computed upper surface 
skin friction coefficient is compared with experiment in 
Figure 13. The computations predict flow separation at 
the shock at about 63% of the chord and reattachment 
at about 70% of the chord. This separation location is 
in reasonable agreement with other computations, al- 
though there is a large disparity in the location of the 
reattachment point between various calculations. The 
upper surface displacement thickness and momentum 
thickness distributions are plotted in Figure 14. They 
confirm the general trend of reasonably good agreement 
before separation and poorer agreement after it. The 
convergence plot (Figure 15) shows that the solution has 
converged to a steady state in about 100 Work Units. 

4.3 Computational Requirements 

Most of the computations presented in this .paper were 
performed on an IBM 3090/600J. The code was vector- 
ized to increase execution speeds. A typical computa- 
tion took about llOps per Work Unit per grid point 
which is about 20% more than the time required for 
an Euler calculation using the same method. About 
1.5 MB of memory was required for these calculations, 
all of which were done in double precision (64-bit arith- 
metic). 


5 Conclusions 

The multigrid diagonal Alternating Direction Implicit 
scheme developed by Caughey [2] has been extended to 
solve the thin-layer Navier-Stokes equations for com- 
pressible flow. The Baldwin-Lomax algebraic turbu- 
lence model was used. Results including boundary layer 
quantities and velocity profiles for turbulent transonic 
flows past airfoils were presented. They show that for 
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attached flows the computed flowfield data are in good 
agreement with the experimental data. For flows with 
strong shocks and shock-induced separation the agree- 
ment is poor particularly after separation. This can 
most likely be attributed to the equilibrium nature of 
the turbulence model used. The convergence rates are 
about the same for all cases and do not seem to dete- 
riorate significantly with the increasing complexity of 
the flow. The rates also compare favorably with those 
obtained using the explicit Runge-Kutta method. 
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Present 

Results 

Expts. 
[11, 14] 

Computational 
(VTA) [1] 

Case 1 

Cl 

0.238 

0.241 

0.235 - 0.262 


Cd 

0.0084 

0.0079 

0.0074 - 0.0100 

Case 2 

Cl 

0.416 

0.390 

0.300 - 0.476 


Cd 

0.0406 

0.0331 

0.0345 - 0.0446 

Case 3 

a 

2.5 

2.92 

2.3 - 2.8 


Cl 

0.738 

0.743 

0.717 - 0.822 


Cd 

0.0131 

0.0127 

0.0113 - 0.0180 

Case 4 

a 

2.50 

3.19 

2.5 - 2.96 


Cl 

0.750 

0.743 

0.733 - .859 


Cd 

0.0237 

0.0242 

0.0224 - 0.0277 


Table 1: Comparison of force coefficients for the four 
test cases 



NACA 0012 AIRFOIL Grid 192x48 CFL 24.00 
Mach .700 Alpha 1.490 Re 9.000E+08 
MDI Present Work R— K Martinelli & Jameson 


Figure 2: Comparison of convergence rates for Case 1 



NACA 0012 AIRFOIL 

Mach .700 Alpha 1.490 Re 9.000E+08 
Present Work — Harris [1981]°.* 



NACA 0012 AIRFOIL 

Mach .700 Alpha 1.490 Re 9.000E+08 

Work 500.00 CFL 24.00 Grid 192x48 

1 Level — 4 Levels — 


Figure 1: Surface pressure distribution for Case 1 


Figure 3: Global measures of convergence with and 
without multigrid - Case 1 
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NACA 0012 AIRFOIL 

Mach .799 Alpha 2.260 Re 9.000E+06 
Present Work — Harris [l981]o.» King [1987] — 


RAE 2822 AIRFOIL 

Mach .725 Alpha 2.920 Re 8.500E+06 
Present Work — Cook et al.[1979] o, . 


Figure 4: Surface pressure distribution for Case 2 Fi S u ^ : Surface P reSSure distribution for Case 3 with 

a = ‘2.92 



RACA 0012 AIRFOIL 

Mach .799 Alpha 2.260 Re .900E+07 

Real .267E-02 CFL 20.00 

Res2 .582E-10 Grid 182x48 

Work 500.19 Rate .9653 Nraesh 4 



RAE 2822 AIRFOIL 

Mach .725 Alpha 2.500 Re 6.500E+08 
Present Work — Cook et al.[1979] o. * 


Figure 5: Convergence history for Case 2 


Figure 7: Surface pressure distribution for Case 3 with 
a = 2.5 
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RAE 2022 AIRFOIL 

Mach .725 Alpha 2.500 Re 8.500E+06 
Present Work — Cook et al.(l979] o 




Mach .725 Alpha 2.500 Re 0.5OOE+O0 
Present Work — Cook et al.[l979] o . » 


Figure 8: Skin friction distribution for Case 3 Figure 10: Boundary layer velocity profiles at two loca- 

tions for Case 3 



Mach .725 Alpha 2.500 Re 6.5OOE+O0 
Present Work — Cook et al.[ 1979] o 



RAE 2022 AIRFOIL 

Mach .725 Alpha 2.920 Re .650E+07 
Reel .154E-02 CFL 24.00 

Res2 .404E-O9 Grid 192x40 

Work 500.19 Rale .9704 Nmesh 4 




Figure 9: Displacement and momentum thicknesses for 

Case 3 Figure 11: Convergence history for Case 3 
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RAE 2822 AIRFOIL 

Mach .750 Alpha 2.500 Re 8.200E+06 
Present Work — Cook et al.[l079] o. • 




RAE 2B22 AIRFOIL 

Mach .725 Alpha 2.500 Re 6.500E+08 
Present Work — Cook et al.[1979] o 


Figure 12: Surface pressure distribution for Case 4 Figure 14: Displacement and momentum thicknesses for 

Case 4 



RAE 2822 AIRFOIL 

Mach .750 Alpha 2.500 Re 8.200E+08 
Present Work — Cook et al.[l979] o 



RAE 2822 AIRFOIL 

Mach .750 Alpha 2.500 Re .620E+07 
Reel .173E-02 CFL 20.00 

Res2 .840E-08 Grid 192x48 

Work 500.19 Rate .9753 Nraesh 4 


Figure 13: Skin friction distribution for Case 4 


Figure 15: Convergence history for Case 4 



Evaluation of Navier-Stokes Solutions Using the 
Integrated Effect of Numerical Dissipation 

R. R. Varma and D. A. Caughey 


Reprinted from 



Volume 32, Number 2, Pages 294-300 


( $/*JAA * 

A publication of the 

American Institute of Aeronautics and Astronautics, Inc. 
370 L’Enfant Promenade, SW 
Washington, DC 20024-2518 


AIAA Journal 

Vol. 32, No. 2, February 1994 


Evaluation of Navier-Stokes Solutions Using the Integrated 
Effect of Numerical Dissipation 


R. R. Varma* and D. A. Caugheyt 

Cornell University, Ithaca, New York 14853 


A method for evaluating the quality of solutions to the Navier-Stokes equations is developed and illustrated 
with representative examples. In solutions to the Navier-Stokes equations it is important that added numerical 
dissipation does not overwhelm the real viscous dissipation. To verify this, it is necessary to be able to estimate 
quantitatively the effect of numerical dissipation. A method for estimating the integrated effect of numerical dis- 
sipation on solutions to the Navier-Stokes equations is developed in this paper. The method is based on integra- 
tion of the momentum equations, and the computation of corrections due to numerical dissipation to the drag 
integral. These corrections can then be considered as estimates of the error due to dissipation. Solutions to the 
Navier-Stokes equations for laminar and turbulent flows over airfoils are used to illustrate the method. The 
errors due to numerical dissipation are compared with the total numerical errors in the solutions. The effect of 
Mach number scaling of the numerical dissipation terms is discussed. 


I. Introduction 

T HE evaluation of the quality of any numerical solution of the 
Navier-Stokes equations and the validation of the computer 
code that yielded the solution necessarily require an estimation of 
the errors in the solution. As Holst 1 has pointed out, these errors 
fall under two broad categories — physical modeling errors and nu- 
merical errors. The physical modeling errors include, among oth- 
ers, those arising from the approximations involved in the Navier- 
Stokes equations themselves, or their thin-layer approximation, as 
well as those introduced by any model for the effects of turbu- 
lence. The numerical errors include those due to the basic discreti- 
zation scheme, including any implicit or explicit numerical dissi- 
pation, and are dependent upon the fineness and distribution of the 
grid. Physical modeling errors can be quantified only by compari- 
son with the results of experiments or with the results of direct nu- 
merical simulations in which the corresponding approximations 
are not made. Before these comparisons can be meaningful, how- 
ever, it is important to understand the level of numerical error, and 
this can be done without recourse to comparison with experiments. 
It is with these numerical errors and, in particular, with the effects 
of numerical dissipation, that the present article is concerned. 

The calculation of fluxes in several widely used finite volume 
schemes used to solve the Euler and Navier-Stokes equations can 
be shown to be equivalent to central differencing. Such schemes, 
applied to the Euler equations, do not contain any inherent dissipa- 
tion. To prevent odd-even point decoupling and oscillations near 
shock waves or stagnation points, numerical dissipation terms 
must be added when solving the Euler equations. The Navier- 
Stokes equations, on the other hand, possess dissipative properties 
due to the presence of the viscous terms, but the physical dissipa- 
tion provided by these terms in regions far away from the surface 
is usually small, and the addition of numerical dissipation terms is 
still necessary to ensure the stability and robustness of the 
schemes. While the added dissipation terms must be large enough 
for this purpose, they must also be small enough not to overwhelm 


Presented as Paper 93-0539 at the AIAA 31st Aerospace Sciences Meet- 
ing, Reno, NV, Jan. 1 1-14, 1993; received April 5, 1993; revision received 
July 22, 1993; accepted for publication Aug. 3, 1993. Copyright © 1993 
by the American Institute of Aeronautics and Astronautics, Inc. All rights 
reserved. 

Graduate Student; currently Post-Doctoral Fellow, Center for Compos- 
ite Materials, University of Delaware, Newark, DE 197160. Student Mem- 
ber AIAA. 

^Professor and Director, Sibley School of Mechanical and Aerospace 
Engineering. Associate Fellow AIAA. 


the effects of the real viscous dissipation in regions where the lat- 
ter is significant. 

Most previous attempts at studying the effects of numerical dis- 
sipation have been based primarily on qualitative comparisons of 
computed solutions and experiments. For the Euler equations, 
Caughey and Turkel 2 looked at the effects upon solution accuracy 
of various forms of the dissipative terms and the smoothness of the 
mesh. They used nonphysical behavior of the solution, such as os- 
cillations in the surface pressure distribution near the airfoil trail- 
ing edge, to study the effects of numerical dissipation. A similar 
approach was followed by Swanson and Turkel 3 for both the Euler 
and Navier-Stokes equations. They analyzed various ways of re- 
ducing artificial dissipation in central difference schemes for the 
solution of these equations. Their efforts were also directed at ob- 
taining better qualitative behavior of the solution in critical regions 
of the flow and better agreement with experimental data. While 
this approach provides some useful insight, there is clearly a need 
to develop a method that provides quantitative estimates of the ef- 
fect of numerical dissipation on the solution. 

In this paper, we first develop a method for estimating quantita- 
tively the integrated effect of numerical dissipation on solutions to 
the Navier-Stokes equations. Using this method we then evaluate 
the quality of solutions to the thin-layer Navier-Stokes equations 
for two-dimensional transonic flows over airfoils obtained using 
the multigrid diagonal implicit method of Varma and Caughey. 4 

II. Analysis 

The governing differential equations considered here are the 
thin-layer Navier-Stokes equations in two dimensions. These equa- 
tions are transformed from a Cartesian coordinate system (x, y) 
into a generalized coordinate system (£, T|). Near the airfoil sur- 
face, the ^-coordinate is approximately parallel to, and the r|-coor- 
dinate is approximately normal to, the body. The airfoil surface it- 
self is a ^-coordinate line. The transformed equations are modified 
by artificial dissipation terms. The form of dissipation used in 
these calculations is based on the adaptive blend of second and 
fourth differences described by Jameson et al. 5 and modified by 
Caughey. 6 The system of equations can be written in the fully con- 
servative form, 

dW dF dG 9G'v dD K w dD n w _ n ^ 

dt + 35 + an an ^ 

where 

W = /i{p, pw, pv, e) T (2) 
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is the vector of conserved dependent variables. The transformed 
inviscid flux vectors are 

F = h{pU, p Uu + \ x p, p Uv + % y p, ( e + p)U) T (3) 

G = h{ pK p Vu + q^p, pVv + r\ y p, ( e + p)V) T (4) 

Here h is the determinant of the Jacobian of the transformation, p 
is the fluid density, u and v are the respective velocity components 
in the x and y directions, U and V are the contravariant components 
of velocity in the q and T) directions, respectively, and e is the total 
energy per unit volume. The thin-layer approximation to the trans- 
formed viscous flux vector is 

G' v = h{ 0, T|X, + hXy + ^Xy 

+ ^'yy,A x {UO' xx +VO' xy -q' x ) 


+ T\ y (.uO xy +vo' yy -q' y )}'‘ 


(5) 


where a xx , G xy , and a' yy are the thin-layer contributions to the 
Cartesian viscous stresses, and q' x and q y are the corresponding 
contributions to the heat fluxes. The dissipative fluxes across cell 
faces in the £, and T| directions are D^W and where the differ- 
ential operators D ^ and D n have the form 


D 


5 


<2) _9 3_ (4) 3 2 

3$ 


and D n = e 


( 2 ) _ 3 _ 
1 3q 


3 (4) d 2 


( 6 ) 


The coefficients of dissipation, e (2) and e (4) , are defined following 
Caughey. 6 


A. Integration of the Momentum Equations 

The analysis that follows is based on a generalized derivation of 
the Momentum Theorem. The numerical implementation proce- 
dure will be discussed in Sec. II.D.2. 

Consider a fixed area A bounded by a closed curve C within the 
computational domain as shown in Fig. 1 . The curve C is chosen 
such that it includes the body surface. Integrating the governing 
equations [Eq.(l)], which are satisfied at every interior point, over 
the area A gives 


ff fdff 3F 3G 3Gy 
JJA 3 1 3£, 3r| 3 t| 


3 D^w dD^w' 

3$ 3^T ' 


di; dr) = 0 



Fig. 1 Contour for integration for drag: closed curve C enclosing 
area A. 


Using Green’s theorem, the area integrals over A are converted to 
line integrals along C. This gives 


W d^dq+£(F dq-Gd£) +£g; d£ 


+ | c (D I ,wd^-D^dii) = 0 (7) 

At steady state, the first term in Eq. (7) is identically zero, and the 
integral equation becomes 


o [F dt|-G d^ + G; d^ + D h- d^-D t w dq] =0 

Jr s 


( 8 ) 


The continuity, the two momenta, and the energy equations have 
the same integral form. Our interest is restricted to the calculation 
of drag, and so only the two momentum equations are considered. 
Before we go on, however, a clarification regarding the notation is 
required here. As defined by Eqs. (2-5) the vectors TV, F, G, and 
G' each have four components corresponding to the four equa- 
tions. For the rest of this paper, the same vector notation will be 
used although we consider only the two components correspond- 
ing to the momentum equations. 

The curve C consists of segments along the body surface, the 
outer contour, and the branch cut. Continuity of the solution across 
the cut ensures that all fluxes on one side of the cut exactly equal 
the corresponding fluxes on the other side. Since the cut is tra- 
versed twice — once in each direction — the net integral of fluxes 
along the cut is zero. Eq. (8) then reduces to a generalized form of 
the momentum integral equation: 


[F dq-G d£ + G' A^ + D w A^-D y w dq] 

Body 

+ f [F dq-G d^ + G; A^aD.w A^-D.w dq] = 0 

J Outer ^ 

(9) 


B. Body Surface Integral 

The integral over the body in Eq. (9) is further simplified when 
the mesh on which the solution is obtained is such that the airfoil is 
a line of constant q as is the case in our calculations. Then, the in- 
tegral over the body surface reduces to 


(-Gd£ + G;d^)+ {D^wAZ,) (10) 

* Body " Body 

Using Eqs. (4) and (5), the first term of the above expression can 
be expanded as 



(-Gd^ + c; di;) 


l Boiy [ p d y- a ** d y +a *> dx] ' 

jB0dy [ - pdX “Xdj'+O;dx] / 


( 11 ), 


The physical forces that act on the body, and determine the lift and 
drag, are caused by pressure and viscous stresses. Expressions for 
7 X and 3y, the x and y components respectively of the force 7 on the 
body, can be obtained by integrating the pressure and the viscous 
stresses. These force components are given by 

= J B ody [p d y-°« d y +X d *] 

= j B cdy [ -^ dj: - 0 ^ d ^ +o ;y d ^] 
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Therefore, from Eq. (1 1), we have 



(-Gd^c; d§) 


( 12 ) 


i.e., the integral over the body of the inviscid (pressure) and vis- 
cous fluxes acting on the surface gives the net force on the body. 
Note that the integral over the body surface [Eq. (10)] contains an 
additional term due to the dissipative fluxes; the significance of 
this term will be discussed in the following section. 

C. Corrected Outer Integral 

The momentum integral equation [Eq. (9)] thus yields an ex- 
pression for the force 5 on the body in terms of integrals of the in- 
viscid and viscous fluxes along an outer contour modified by inte- 
grals of the numerical dissipation fluxes. From Eqs. (9), (10), and 
(12), we have 

ff=f (G dJ; - F dr|- G' di;) 

• Outer 

+ [ (D,w dr\-D w d£) - f D w d£ (13) 

■> Outer 1 J Body 


This is termed the corrected outer integral. It is an equivalent ex- 
pression for the forces on the body, and consists of contributions 
from three sources: 1) inviscid contributions from the net pressure 
forces on the outer contour and the net inviscid momentum flux 
across it, 

f (G dT|) 

J Outer 

2) contributions from the net viscous stresses acting on the outer 
contour, 


(-c; d$) 

J Outer 

and 3) contributions due to the numerical dissipation which arise 
from two sources: dissipation fluxes across the outer contour, 

[ (D^w dt| — Dw dij) 

J Outer 

and dissipation fluxes across the body surface, 


corrections due to dissipation, relative to the inviscid contributions 
in particular, are expected to be negligible. It is indeed so, as will 
be shown later. The drag coefficient C d , on the other hand, is 
known to be sensitive to small changes in the solution. So we 
choose to focus our attention on the calculation of the total drag 
coefficient, C d (total), on the body. 

The two contributions to the body surface integral for drag 
[from Eq. (12)] are denoted as C d (p) due to the pressure, and 
C d (/) due to the shear stresses on the surface (skin friction). The 
total drag coefficient is given by 

C d (total) = C d (p) + C d (/) (14) 

For attached flow, we expect the two contributions to be compara- 
ble in magnitude. For flows with significant separation, the pres- 
sure drag is expected to dominate. 

The three contributions to the corrected outer integral for drag 
[from Eq. (13)] are denoted as C rf (inviscid) due to the pressure and 
convective terms, C d (viscous) due to the viscous stresses, and 
C d (diss) due to the dissipative fluxes. The total drag coefficient is 
given by 

C d (total) = Q(inviscid) + C d (viscous) + Q(diss) (15) 

As described in Sec. II.C, the numerical dissipation flux contribu- 
tions to the drag coefficient can be separated further into compo- 
nents corresponding to the outer and body-surface contours: 

C d (diss) = C d (diss-outer) + C d (diss-body) (16) 

Various contours, at different distances from the surface, are cho- 
sen for calculating these contributions to drag. If the body surface 
is chosen as the outer contour, then we have C d (diss-outer) 
= - C d (diss-body), which gives us the body surface integral as ex- 
pected. For a contour close to the surface, C d (viscous) is expected 
to be significant. For contours at sufficiently large distances from 
the surface, C d (inviscid) is expected to dominate. 

For a steady-state solution, the C d (total) computed on the sur- 
face using the body surface integral [Eq. (14)] must equal exactly 
the C d (total) computed along any outer contour using the cor- 
rected outer integral [Eq. (15)], since each of these expressions for 
drag is derived from expressions for T [Eqs. (12) and (13)] that are 
exactly equivalent. The addition of the corrections due to dissipa- 
tion is necessary for this to be true. 

1 . Error Due to Dissipation 

Equating the two expressions for C d (total), we get 



C d (total) = [C d (p) + C d (/)] Bo dy 

= [C d (inviscid) + C d (viscous)] 0uter + C rf (diss) 


If the viscous stresses are negligible on the outer contour and the 
dissipation terms are absent, then Eq. (13) reduces to the usual 
form of the momentum integral equation consisting of only the in- 
tegral of the inviscid fluxes. 

The dissipation contributions from the integral over the body 
can be interpreted as due to artificial sources and sinks of momen- 
tum created on the body surface, which lead to an artificial mo- 
mentum surplus or deficit in the integral along an outer contour. 
This in turn is reflected in the calculation of forces on the body 
from such an outer integral. The numerical dissipation contribu- 
tions listed above may together be considered as corrections due to 
dissipation to the outer integral for the calculation of the forces < S X 
and ‘.f y on the body, and therefore as corrections also in the calcula- 
tion of lift and drag. 


It is clear from this formulation that C d (diss) values can be consid- 
ered as errors in the calculation of drag. From among values of 
Q(diss) for various possible contours, we choose one which re- 
flects best the error due to numerical dissipation in the solution. It 
is possible to set C d (diss-body) to zero through a particular choice 
of boundary conditions as we shall see in Sec. II.E. Therefore, the 
value of C d (diss-body) is not necessarily representative of the total 
error. Values for C d (diss-outer) can be calculated for various con- 
tours, each value representing the effect of dissipation along that 
contour. If we want to control the amount of dissipation in the so- 
lution, i.e., keep it below a certain level, then the quantity that we 
should be most concerned about is the maximum value of C d (diss- 
outer) along any contour. Therefore, Max [C d (diss-outer)] is chosen 
to characterize the error due to dissipation. 


D. Quantitative Estimates of Dissipation 
The added numerical dissipation terms are formally third order 
in the mesh spacing, and are therefore expected to have very little 
effect on the solution if the mesh is sufficiently fine. In the calcula- 
tion of the lift coefficient C, using the corrected outer integral, the 


2. Numerical Implementation 

A numerical approximation to the time-dependent equation, 

n f dW . dF . 3G 3G; dD % w jk _ ^ 

A dt 3£, 9r) dx\ 3$ 3q J d11 " ° 
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may be satisfied exactly in each cell at each time step according to 
the discretization procedure. However, in this paper, we will re- 
strict our attention to evaluating the quality of steady-state solu- 
tions only. In particular, we will focus on the solutions obtained 
using the multigrid method described by Varma and Caughey. 4 
The criterion for convergence to the steady state is that the residu- 
als of the equations be reduced to values below a certain level. If 
the numerical solution is converged such that the residuals are 
down to round-off levels, then the steady state equation [(Eq. 8)] 


(£ [F dq-G d^ + G’ v d£ + £>„w d^-D^w dr>] = 0 (17) 


is satisfied to machine precision in each cell. This equation is also 
satisfied for a curve C enclosing several cells, such as shown in 
Fig. 1, if the numerical scheme is conservative in transport and 
therefore globally conservative, and the fluxes across the curve are 
calculated in a manner which is consistent with the way they are 
evaluated in the residual calculation in the iterative solver. The cal- 
culation of these fluxes is greatly simplified if the curve C is cho- 
sen along grid lines. In this case, the total flux is taken to be the 
sum of the fluxes, as computed by the iterative flow solver, across 
the individual cell faces that make up the curve. It follows that the 
value of C d (total) evaluated on the body surface using the body 
surface integral [Eq. (14)] must agree with the value calculated 
along any outer contour using the corrected outer integral [Eq. 
(15)] to within the degree of convergence. 

E. Dissipation Schemes 

The analytical form of the adaptive blend of second and fourth 
difference dissipation is given by Eq. (6). Numerical implementa- 
tion of this form of dissipation requires the specification of bound- 
ary conditions on both the second and fourth difference terms. 
Along the cut the conditions are periodic, and in the far field the 
gradients are assumed to be negligible. However, on the body sur- 
face, boundary conditions for the normal dissipative flux, 




(2)9h; 
^ dp 


3 (4)3jV 

3t| 2 


cannot be specified uniquely based on simple physical reasoning. 
Several different implementations of the boundary conditions on 
the normal dissipation fluxes are discussed by Varma and 
Caughey. 7 Those that lead to nonzero numerical dissipation fluxes 
on the body surface yield poor quality solutions near the surface, 
particularly on the coarser grids. In this paper, we will consider 
two implementations: 

1) Scheme A: The numerical dissipation fluxes — both the first 
and third difference fluxes — on the solid surface are set equal to 
zero. This leads to 


(AiH , )/, 1 ' / 2=0 

Therefore, only the real viscous and inviscid fluxes are nonzero on 
the body surface. 

2) Scheme B: In regions of large gradients, such as the boundary 
layer and the wake, the numerical dissipation fluxes are expected 


Table 1 Contributions to drag coefficient due to inviscid fluxes, 
viscous fluxes, dissipation fluxes along the outer contour, and 
dissipation fluxes across the body surface, and total drag coefficient for 
four choices of contours: laminar case, 256 X 72 grid, scheme A 


Distance 
from body 
(chords) 

c d 

(inviscid) 

Q 

(viscous) 

Cd 

(diss-outer) 

Cd 

(diss-body) 

Q 

(total) 

0 [body] 

232 

352 

— 

— 

584 

2.7 X KT 1 

244 

348 

-8 

0 

584 

5.2 X 10- 3 

354 

199 

31 a 

0 

584 

1.0 

584 

= 0 

= 0 

0 

584 


“Denotes maximum value of Q(diss-outer) 


Table 2 Comparison of numerical error and dissipation error 
estimates — turbulent flow cases, scheme A 



Grid size 

c d (f) 

C d (p) 

Error in 
C d {p) 

Diss error 

Turbulent case 1 

128 X 36 (4/i) 

65.9 

52.7 

26.7 

39.6 


256 X 72(2A) 

63.5 

31.7 

5.7 

15.7 


512 X 144 (A) 

62.2 

27.4 

1.4 

4.5 


A-> 0 

61.8 

26.0 

— 

— 

Turbulent case 2 

128 X 36 (4/i) 

66.1 

216.0 

21.9 

40.9 


256 X 72 (2/0 

62.8 

199.7 

5.6 

12.9 


512 X 144(A) 

61.9 

195.4 

1.3 

3.3 


A ->0 

61.6 

194.1 

— 

— 


to be large. But these are the very regions where the viscous effects 
are also important. Therefore, in viscous calculations it is common 
to scale the numerical dissipation in the normal direction by multi- 
plying the fluxes by some function of the local Mach number. This 
technique is expected to reduce the effect of numerical dissipation 
near the surface where the local Mach numbers are low without af- 
fecting it in the rest of the flowfield. 8 Here the p-direction numeri- 
cal dissipation fluxes across cell faces parallel to the ^-direction 
are scaled by the local Mach number normalized by the freestream 
Mach number, i.e., 


D n w =f(M) X 


/ 


(2)3lV 

p 3T1 


3 ( 4)3 2 w 
^ 3p 2 


'i 


J 


where /(Af) = M/M 


F. Total Numerical Error 

Estimates of the total numerical error in the solution can be ob- 
tained using Richardson extrapolation. 9 Given an initial grid, 
coarser grids are obtained successively by removing every other 
line in each of the two coordinate directions. Iteratively converged 
solutions are first computed on the finest grid (denoted by the sub- 
script h), and then on two successively coarser grids (denoted by 
subscripts 2h and 4 h). The coefficient of total drag C d and contri- 
butions due to skin friction and pressure forces are computed on 
each of these grid levels. Asymptotic values (denoted by subscript 
0), i.e., values in the limit of zero mesh spacing, are estimated 
based on the order of convergence. When the convergence is sec- 
ond order, as expected for the computational scheme used here, the 
estimate of the asymptotic value is (C d ) 0 = l A [4(Q)a - (Q)^]. 
Convergence studies for the drag coefficient to be presented in the 
next section verify this second-order accuracy. From the asymp- 
totic values, the total numerical errors in the drag coefficient for 
each grid level can be calculated. 


III. Results of Solution Evaluations 

Representative laminar and turbulent flow solutions to the 
Navier-Stokes equations for flows past airfoils are evaluated using 
the method described in the preceding section. In the two lifting 
cases analyzed here, the effect of the numerical dissipation on the 
lift coefficient is found to be negligible — less than 0.1% on all 
grids. However, the effect of the dissipation on the drag coefficient 
is not negligible. And so, as mentioned earlier, we will concentrate 
on the precise calculation of the drag coefficient. 

For each calculation, the coefficient of drag is computed using 
the body surface integral and corrected outer integrals. From a 
breakdown of contributions to the computed drag along various 
contours, the error due to numerical dissipation is estimated quan- 
titatively. The asymptotic behavior of this error is studied as the 
grid is refined. The estimated dissipation error is compared with 
the total numerical error, which is obtained using Richardson ex- 
trapolation. This procedure is repeated for the two numerical dissi- 
pation schemes described in Sec. H.E, and the merits of the 
schemes are discussed. The usefulness of this method in evaluating 
the quality of flow solutions is thus demonstrated. 
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Table 3 Error due to numerical dissipation — scheme A 



Grid size 

Dissipation error 

C d (total) 

Laminar case 

128 X 36 (4ft) 

121.9 

630.8 


256 X 72 (2 ft) 

30.8 

584.3 


512 X 144 (ft) 

4.6 

569.8 

Turbulent case 1 

128 X 36 (4ft) 

39.6 

118.6 


256 X 72 (2 ft) 

15.7 

95.2 


512 X 144 (ft) 

4.5 

89.5 

Turbulent case 2 

128 X 36 (4ft) 

40.9 

282.0 


256 X 72 (2 ft) 

12.9 

261.1 


512 X 144 (ft) 

3.3 

256.5 


All values for the coefficient of drag in the tables presented here 
are expressed in terms of drag counts; one drag count equals 
0.0001. 

A. Representative Solutions 

Three different flow solutions — one laminar case and two turbu- 
lent cases — are evaluated. 

We first evaluate the laminar case — NACA 0012 airfoil at = 

0.5, a = 0 deg, and Re„ = 5000. This is a symmetric airfoil at zero 
angle of attack; consequently the lift is zero. The coefficient of 
drag is therefore computed from the integral of the x-momentum 
equation alone 

We then evaluate turbulent case 1 — NACA 0012 airfoil at = 

0.7, a = 1.49 deg, and Re „ = 9 X 10 6 . The solutions computed for 
this case indicate that the flow is attached and is only slightly su- 
personic in a small pocket near the leading edge on the upper sur- 
face. Therefore the pressure drag is relatively small, and the skin- 
friction drag is a significant portion of the total drag. 

Finally, we evaluate turbulent case 2 — RAE 2822 airfoil at 
= 0.75, a = 2.5 deg, and Re^ = 6.2 X 10 6 . The computed solutions 
show a strong shock above the airfoil at about 65% of the chord 
producing wave drag and causing a significant amount of flow 
separation. The pressure drag component, and therefore the total 
drag, is substantially larger than in the NACA 0012 case. 

Well-converged solutions were computed using the MDI 
algorithm 4 with Schemes A and B for the numerical dissipation 
(Sec. II.E). The finest grids (512 X 144) used in the series of calcu- 
lations presented here were generated using the GRAPE pro- 
gram. 10 For the laminar flow calculations, the distance to the first 
grid point normal to the airfoil surface was about 10 4 chord; while 
for the turbulent flow calculations, it was about 10‘ 6 chord. For all 
three grids, 62.5% of the mesh cells in the wrap-around direction 
were on the airfoil surface. From each fine grid two coarser grids 
(256 X 72 and 128 X 36) were obtained sequentially by removing 
every other line in each of the two mesh directions. The coarsest 
grid was fine enough to resolve most features in the boundary 
layer. 

B. Dissipation Error Estimates 

The estimation of dissipation errors is demonstrated using 
Scheme A for the laminar flow solution obtained on the 256 X 72 
grid. (See Table 1.) The body surface integral is used to obtain the 
pressure and skin-friction drag components on the airfoil surface. 
The pressure drag [C d (p) = 0.0232] accounts for about 40% of the 
total drag; the skin-friction component [C d ( / ) = 0.0352] accounts 
for the remaining 60%. The contributions to the total drag com- 
puted using the corrected outer integral along two outer con- 
tours — one close to the surface and the other about a chord away — 
are considered next. The breakdown of C d into its various compo- 
nents is as expected. Along a contour corresponding to the first 
gridline off the surface (2.7 X 10 -4 chord), the breakdown of C d 
between the inviscid and viscous components is still roughly 
40:60. But there is a correction due to dissipation [Q(diss) = 
—0.0008] which is about 1.4% of the total drag. Along a contour 
which is a chord away from the surface, the inviscid flux integral 
[C d (inv) = 0.0584] essentially gives the total drag. Because we set 
the numerical dissipation fluxes on the surface to zero, the correc- 


tion due to dissipation C d (diss) is solely from the outer integral of 
the dissipation fluxes. The maximum value of C d (diss-outer), 
which occurs along a contour about 0.005 chord from the airfoil 
surface, is about 0.0031. This value amounts to about 5.3% of the 
total drag coefficient, and provides an estimate of the error due to 
dissipation in the solution. As expected for an iteratively con- 
verged solution, the total drag coefficient has the same value of 
0.0584 for all contours, including the body surface. In this manner, 
dissipation error estimates — maximum values of C d (diss-outer) — 
are easily obtained for all solutions. 

The asymptotic behavior of the dissipation errors as the grid is 
refined is studied next. Table 2 shows the dissipation errors and the 
total drag coefficients on the three grid levels for the three cases. 
As expected, the dissipation errors are relatively large (14—33% 
here) on the coarsest grids and small (only up to 5% here) on the 
finest grids. For the laminar case, the dissipation error goes down 
by a factor of about four from the coarsest grid to the next finer 
grid, and by a factor of nearly seven from the latter grid to the fin- 
est grid. For the two turbulent-flow cases, the dissipation errors on 
the finer grids reduce only by a factor of about four. The dissipa- 
tion terms introduce third-order errors for a uniform grid. The re- 
duced accuracy seen here may be due to the stretching of the grid, 
particularly in the boundary layer. 

C. Comparison of Total and Dissipation Errors 

The total numerical errors in the solutions can be estimated 
using Richardson extrapolation. To obtain these estimates, the val- 
ues of the pressure drag C d (p) and the skin-friction drag C d ( f ) on 
the three grid levels are considered. The order of convergence of 
the drag components is determined, and the errors estimated from 
the asymptotic values. Comparisons for the two turbulent-flow 
cases are presented here. 

The solutions for the turbulent case 1 are analyzed first. The 
skin-friction drag values, as seen in Table 3, are quite close to the 
asymptotic value of 0.00618 even on the coarse meshes. To esti- 
mate the total numerical errors in the solutions, we will consider 
the C d (p) values. The convergence of C d (p) in the limit as the 
square of the mesh spacing tends to zero is shown in Fig. 2. The 
line indicates the linear least-squares fit for the data points corre- 



Fig. 2 Plot showing the pressure drag components on meshes with 
spacings A, 2A, and 4ft; and the linear least squares fit through them — 
turbulent case 1, Scheme A (Table 3). 
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pation errors are very small. The drag component values for both 
schemes are essentially second-order accurate. However, the val- 
ues with Mach number scaling are closer to the asymptotic values. 

The errors due to numerical dissipation in the total drag coeffi- 
cients for the two schemes are compared next. The results for the 
laminar flow case are shown in Fig. 5. At the finest grid level, the 
dissipation error in Scheme B is only slightly less than in the other 
schemes, but on the coarser meshes the effect of the Mach number 
scaling of the dissipation terms appears more significant. The re- 
sults for one of the turbulent flow cases (turbulent case 2) are 
shown in Fig. 6. The errors range from about 15% of the total drag 
on the coarsest grid to about 1 % on the finest grid. They appear to 
approach third-order accuracy on the finer grids. For both schemes 
the dissipation errors are of comparable magnitude, but a reduction 
in the dissipation error with Mach number scaling is observed. 


E. Summary of Results 

For three cases involving both laminar and turbulent flows, the 
errors due to numerical dissipation are estimated and compared 
with the total numerical error. The total numerical errors are based 
on the errors in the drag components, while the dissipation errors 


1 90 . 0 i — i — i — i — i — ‘ — ‘ — i — ■ — i — i — < — i — i — i — i — 1 
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Fig. 3 Plot showing the pressure drag components on meshes with 
spacings ft, 2 ft, and 4A; and the linear least squares fit through them — 
turbulent case 2, Scheme A (Table 3). 


sponding to the three grids. The convergence is close to second- 
order accurate. The asymptotic value for the pressure drag — C d {p) 
= 0.0026 — is calculated. The errors in C d ( p) are computed and 
compared with the errors due to dissipation in Table 3. The two er- 
rors are of comparable magnitude on all three grids, although the 
dissipation errors are consistently larger than the total numerical 
error estimates. 

The turbulent case 2 solutions are considered next. The values 
of C d if) are fairly accurate even on the coarse meshes as seen in 
Table 3; an asymptotic value of about 0.0062 is estimated. The 
computed pressure drag, which is affected by the strength of the 
shock and the extent of flow separation, depends more strongly on 
the resolution of the grid. The values of C d {p) on the three grids 
are shown in Table 3, and are plotted against the square of the 
mesh spacing in Fig. 3. The convergence is again very nearly sec- 
ond-order accurate; an asymptotic value for C d (p) equal to 0.0194 
is obtained. Based on this asymptotic value, errors in C d (p), which 
can also be considered as estimates of the total numerical error in 
the solutions, are computed. Comparison of these errors with the 
errors due to dissipation shows again that the two are of compara- 
ble magnitude but the dissipation errors are larger than the esti- 
mated numerical errors on all grid levels. 

D. Mach Number Scaling of Dissipation Terms 

Solutions for the three representative cases computed with and 
without Mach number scaling of the dissipation terms are analyzed 
and compared in this section. In Scheme A, the dissipation fluxes 
on the surface are set to zero and therefore C d (diss-body) is identi- 
cally zero. In Scheme B, the dissipation fluxes in the q -direction 
were scaled by the local Mach number normalized by the free- 
stream Mach number. The local Mach number on the airfoil sur- 
face is zero because of the no-slip boundary condition, so the nu- 
merical dissipation fluxes on the surface are again identically zero. 

The pressure and skin-friction drag components for the laminar 
flow case on the three grids are compared for Schemes A and B in 
Fig. 4. For each of the drag components, the asymptotic values as 
the mesh spacing tends to zero are nearly the same for both 
schemes [C d (p) = 0.0232-0.0233; C d (f) = 0.0332-0.0334], On 
the finest grid, the two solutions are very similar because the dissi- 
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Fig. 4 Convergence with mesh spacing of the pressure and skin-fric- 
tion drag coefficients for Schemes A and B — laminar case. 
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Fig. 5 Reduction in dissipation error with mesh spacing for Schemes 
A and B — laminar case. 
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Turbulent Flow Case 2 
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Fig. 6 Reduction in dissipation error with mesh spacing for Schemes 
A and B — turbulent case 2. 


are based on the maximum value of C d (diss-outer). In the cases 
considered here, the two errors are of the same order of magnitude, 
although the dissipation errors are in general larger than the total 
numerical errors. The total numerical errors scale very nearly with 
the second power of the grid spacing, but the dissipation errors are 
consistently between second and third order, even on rather fine 
grids. Both schemes for numerical dissipation produce errors of 
comparable magnitude in all cases. However, the Mach number 
scaling of the dissipation terms reduces the dissipation errors in 
most cases. 


IV. Conclusions 

A method for estimating the effect of numerical dissipation on 
solutions to the Navier-Stokes equations is developed. The analy- 
sis follows a generalized derivation of the momentum integral 
equations. An exact expression for the lift and drag forces on a 
body is obtained in terms of a corrected outer integral. This inte- 
gral contains corrections due to dissipation in addition to contribu- 
tions from inviscid and viscous fluxes along the outer contour. The 
results presented here demonstrate that the corrections to drag due 
to the added numerical dissipation can be used to estimate the ef- 
fect of this dissipation on the solution. The total numerical error 
can be estimated using Richardson extrapolation from the values 
of the pressure and skin-friction drag on the surface. These two 


error estimates together provide a means of judging the quality of 
computed solutions. The dissipation errors and the total numerical 
errors are of comparable magnitude for the cases considered here. 
While the total numerical errors are second order in the mesh spac- 
ing as expected, the dissipation errors are between second and 
third order. Mach number scaling of the normal numerical dissipa- 
tion fluxes reduces the dissipation errors in most cases. 

As Navier-Stokes computations for high Reynolds number 
flows become more routine, the need for better techniques for 
evaluating the quality of the solutions becomes more important. 
The dissipation error estimation method described here is a useful 
tool for this purpose. It can also be used to guide the development 
of new numerical dissipation models. 
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MULTIGRID DIAGONAL IMPLICIT SOLUTIONS 
FOR COMPRESSIBLE TURBULENT FLOWS 
AND THEIR EVALUATION 


Rama Rajaraja Varma, Ph.D. 

Cornell University 1993 

A numerical scheme to solve the two-dimensional Navier-Stokes equations 
is developed, and applied to several compressible turbulent flows over airfoils. 
A method for evaluating the quality of these solutions is then developed and 
illustrated with representative examples. 

The distinguishing features of the numerical scheme are its implicitness 
for improving stability, the diagonalization of the matrices in the implicit op- 
erator for computational efficiency, and the implementation within a multi- 
grid procedure for convergence acceleration. A finite volume approximation 
is used for spatial discretization of the governing equations to handle compli- 
cated geometries. Artificial dissipation is added in the form of an adaptive 
blend of second and fourth differences of the solution to maintain robustness 
and stability. The viscous terms are treated explicitly to maintain the diag- 
onal form. Results of simulations of viscous transonic flows past airfoils are 
presented. The computed flow field quantities are compared with those from 
other computations and experiments to confirm the accuracy of the method. 
Comparisons of convergence rates are made to demonstrate the efficiency of 
the method. 




In solutions to the Navier-Stokes equations it is important that the added 
numerical dissipation does not overwhelm the real viscous dissipation. In 
order to verify this, it is necessary to be able to estimate quantitatively 
the effect of numerical dissipation. A method for estimating the integrated 
effect of numerical dissipation on solutions to the Navier-Stokes equations is 
developed in this dissertation. The method is based on integration of the 
momentum equations, and the computation of corrections due to numerical 
dissipation to the drag integral. These corrections can then be considered 
as estimates of the error due to dissipation. Solutions to the Navier-Stokes 
equations for laminar and turbulent flows over airfoils are used to illustrate 
the method. The errors due to numerical dissipation are compared with the 
total numerical errors in the solutions. The effects of different boundary 
conditions on the numerical dissipation are evaluated, providing means of 
judging the quality of the solutions. 
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Abstract — Recent advances in the development of the diagonalized alternating direction implicit multigrid 
method for compressible aerodynamic problems are reviewed. These include the extension of the method 
originally developed for the Euler equations to include viscous effects, the computation of turbulent flows 
and the implementation on parallel computers of the scheme on multiblock grids. 


1. INTRODUCTION 

The multigrid method was first applied successfully to solve the Euler equations of inviscid, 
compressible flow by Jameson [1], using the time-marching scheme of Jameson et al. [2]. 
While this marching method is usually referred to as an “explicit, multistage (or Runge-Kutta) 
scheme”, in order to be effective as an iterative technique it is usually necessary to enhance 
the smoothing properties of the scheme using enthalpy damping (for the Euler equations) and 
implicit residual smoothing for use on highly stretched grids. The fact that the implicit residual 
smoothing is implemented for multidimensional problems in an alternating direction implicit (ADI) 
fashion suggests that ADI schemes themselves would also be effective smoothers for use with 
multigrid. 

Such a method has been developed by Jameson and Yoon [3]. In order for the implicit method 
to be an effective smoothing algorithm when used in conjunction with the multigrid algorithm, 
it is important to include an accurate representation of the numerical dissipation terms. These 
usually include fourth-differences of the solution in order to maintain high accuracy, and their 
inclusion in the implicit operator requires the solution of pentadiagonal systems of equations for 
each one-dimensional factor. To avoid the high cost of solving block pentadiagonal systems, these 
equations can first be diagonalized at each point using a local similarity transformation, an idea 
first introduced by Chaussee and Pulliam [4], This procedure decouples the equations so that the 
solution only of scalar pentadiagonal equations is required for each factor. The resulting method 
has good high-wavenumber damping, so it is a good smoothing algorithm for use in conjunction 
with the multigrid method, yet it remains computationally efficient because of the need to solve 
only scalar systems of equations. 

An efficient diagonalized ADI multigrid method of this sort has been developed by Caughey, 
who applied the scheme to compute transonic flows past airfoils [5]; the method has also been used 
to compute two-dimensional supersonic inlet flow fields by Iyer and Caughey [6]. The algorithm 
has been extended to three-dimensional flows by Yadlin and Caughey [7], who computed flows past 
swept wings. A multiblock grid version of the two-dimensional implementation of the algorithm 
has been implemented on a shared-memory parallel computer by Yadlin and Caughey [8]. 

More recently, the method has been extended to treat the flow of real fluids by 
incorporating the viscous terms of the Reynolds-averaged Navier-Stokes equations, or their 
thin-layer approximation, including turbulent flows using a simple algebraic model. Additional 
work on the multiprocessing implementation of the algorithm on multiblock grids has lead to a 
more general multigrid strategy. This article is intended to provide an overview of these recent 
developments. 

In the next section, the algorithmic ideas upon which the method for the Euler equations is 
based will be described very briefly. The developments needed to extend the method to treat viscous 
flow problems will then be described, as will the implementation of the multiblock grid algorithm 
on multiprocessing computers. Results will be presented for laminar and turbulent flows past 
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two-dimensional airfoils, and the parallel-processing results will be presented for solutions to the 
Euler equations for the transonic flow past three-dimensional, swept wings. 

2. ALGORITHMIC ISSUES 

The spatial derivatives in the partial differential equations of fluid flow are represented using the 
finite-volume approximation with adaptive dissipation developed by Jameson et al. [2, 9]. In a 
finite-volume method, the spatial derivatives are approximated by evaluating the net flux across 
the faces of each mesh cell using constant values of the flow properties on each face. In the present 
method, the dependent variables are taken to represent average values for the cells; alternatively, 
they can be interpreted as values of the dependent variables defined at the cell centers. The value 
on each face is taken to be the average of the cells sharing the face. This approximation is equivalent 
to the centered difference scheme that is second-order accurate in the mesh spacing when the mesh 
is sufficiently smooth. 

In order to prevent decoupling of the solution at alternate cells in the grid, dissipative terms 
must be added. Following Jameson [9], the dissipative terms are constructed as an adaptive blend 
of second- and fourth-differences of the solution in each of the mesh directions. In the original 
formulation of Jameson, the dissipative terms are scaled with coefficients that make them inversely 
proportional to the time step corresponding to a unit Courant number for each mesh cell. 
Several researchers [5, 10, 1 1] have found that stability can be maintained while introducing less 
spurious dissipation if the dissipative terms are scaled differently for each mesh direction. This 
strategy allows the use of the minimum dissipation to stabilize the one-dimensional problems in 
each of the coordinate directions, rather than using the largest value for all directions, which is 
approximately what the original Jameson strategy does for large aspect-ratio cells. 

Block ADI methods for the equations of compressible gasdynamics were first introduced by 
Briley and McDonald [12] and by Beam and Warming [13]. The basis of these methods is to 
approximate the spatial derivatives as weighted averages of differences taken at the old and new 
time levels, linearizing the changes in the flux vectors in time, and then to approximate the implicit 
operator as a product of one-dimensional factors. The linearization in time introduces the 
Jacobians of the transformed flux vectors with respect to the solution. The elements of these 
matrices can be expressed explicitly as functions of the solution and the elements of the Jacobian 
matrix of the coordinate transformation, and are given by Warming et al. [14] and Chaussee and 
Pulliam [4], 

It is important to include the contributions of the dissipative terms in the implicit operator. 
Jameson and Yoon [3] suggested choosing the coefficients of the numerical dissipation in the 
implicit operator in a way that forces the amplification factor in a linear stability analysis of 
the scheme applied to a scalar model equation to tend to zero in the high-wave number limit. 
Caughey’s results [5] have shown no clear advantage to this choice; setting the numerical dissipation 
coefficients in the implicit factor equal to those in the residual seemed to work just as well. 

Although the resulting system of equations is linear, it has too large a bandwidth for practical 
solution of problems in more than one space dimension. To improve the efficiency of the scheme, 
the implicit operator is further approximated as a product of one-dimensional factors. Thus, to 
advance the solution one time-step requires the solution of a block pentadiagonal system along each 
coordinate line in each mesh direction. The size of the blocks is 4 x 4 for two-dimensional problems 
and 5 x 5 for three-dimensional problems. 

Such an ADI scheme would be reasonably efficient were it not necessary to add numerical 
dissipation to stabilize what is effectively a central-difference approximation. In fact, if only 
second-difference dissipative terms were added, it would still be necessary to solve only block 
tridiagonal systems. The inclusion of fourth-differences, however, is essential if the solution is 
ultimately to converge to a steady state, and it is important to treat these differences implicitly 
if the solution is to converge rapidly [15]. This can be done in a straightforward manner, following 
the development above, but leads to a requirement to solve block pentadiagonal systems for 
each factor. This requires approximately twice the computational labor of the block tridiagonal 
solutions, and begins to become computationally prohibitive. 

An alternative is to diagonalize the equations at each mesh point, yielding a decoupled set of 
equations, each of which can be solved using a scalar pentadiagonal solver. This requires 
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approximately one-quarter the computational labor of the block pentadiagonal solution (and 
requires, in fact, only about half the work required to solve the block tridiagonal systems). The cost 
of determining the elements of the modal matrices of the Jacobians required to perform the 
diagonalization is about twice that of computing the elements of the Jacobian matrices in the block 
method, but even with the extra matrix-vector multiplications required to perform the diagonaliza- 
tion, the diagonalized procedure is considerably more efficient than solution of the block 
pentadiagonal systems. The relative advantage of the diagonal scheme is even greater on vector 
computers, since the determination of the elements of the modal matrices (and of their inverses) 
and the additional matrix-vector multiplications are all easily vectorizable. 

The treatment of the explicit boundary conditions in the far field follows that of Jameson [9], 
based upon the Riemann invariants of the one-dimensional problem normal to the boundary. The 
solutions computed to date have involved only subsonic free stream Mach numbers, so the far field 
boundaries have either subsonic inflow or outflow. It is necessary to compute the solution on these 
boundaries, therefore, using combinations of free stream values and those extrapolated from the 
interior, the precise mix depending upon the direction of propagation of the relevant characteristics. 

At the body surface, only the pressure is required since the contravariant velocity component 
normal to the boundary is identically zero there. The pressure at the body surface is determined 
from the normal momentum equation, using the formula proposed by Rizzi [16]. As a result of 
the diagonalization of the ADI scheme, it is straightforward to treat the implicit boundary 
conditions in a manner consistent with the characteristic theory. The intermediate and final 
corrections are approximations to the changes in the vectors of characteristic variables for the 
one-dimensional problems along the lines being solved. The boundary conditions for corrections 
to those elements corresponding to characteristics entering the domain are taken to be homo- 
geneous Dirichlet, while those for elements corresponding to characteristics leaving the domain are 
taken to be homogeneous Neumann. 

The incorporation of the scheme within the multigrid algorithm is straightforward, following 
the procedure developed by Jameson [1], An auxiliary mesh is defined by eliminating every second 
line of the fine grid, effectively doubling the mesh spacing in each direction. Values of the flow 
variables are restricted to the coarser grid using area-(or volume-)weighted averages of the solution 
on the fine grid. It is important that the corrections on the coarser grid be driven by the residual 
computed on the fine grid, so a forcing function is defined to account for corrections to the solution 
computed on the coarse grid. After corrections have been computed on the coarser grid, the process 
is continued to still coarser grids, again being careful to ensure that the corrections computed are 
driven by the residuals restricted from the fine grid. After corrections have been computed on the 
coarsest grid, they are prolonged back to successively finer grids using bilinear interpolation in the 
computational coordinates. The addition of corrections to the solution on the finest grid completes 
the multigrid cycle. 

It is usually necessary to update only the body-surface boundary conditions on coarser grids, 
although updating the far field boundary conditions as well does not impede convergence. For the 
present scheme, the overall convergence rates seem to be relatively insensitive to the number of time 
steps (smoothing iterations) performed at each stage of the multigrid cycle. The best asymptotic 
rates are obtained using a fixed “sawtooth” cycle, in which one or two time steps are performed 
on the finest, and on each coarser grid as the grid is coarsened, but no smoothing is performed 
on the coarser grids after corrections have been added. Since the computational work required per 
time step is very nearly proportional to the number of grid cells, the work per time step on each 
coarser grid is approximately one-quarter (one-eighth) that on the previous grid for a two- 
dimensional (three-dimensional) problem. Thus, for the above simple “sawtooth” cycle strategy 
with one time step on each grid level, one multigrid cycle for a two-dimensional (three-dimensional) 
problem requires slightly less than 4/3 (8/7) Work Units, if a Work Unit is defined as the amount 
of computation required for one time step on the fine grid. 

Since the smoothing characteristics of the time-stepping algorithm are more important than 
accuracy on the coarser grids, Jameson [1] has found it desirable to use only a fixed-coefficient, 
second-difference form of the dissipation on coarser grids. In the ADI formulation described here, 
this allows additional efficiency to be achieved by using a scalar tridiagonal solver on all but the 
finest grid level. 
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3. APPLICATIONS 

As described above, the diagonalized ADI multigrid algorithm has been applied to calculate 
transonic flows past airfoils [5, 17] and in two-dimensional inlets [6]. It has been used to compute 
the inviscid flow past swept wings [7] and implemented on multiblock grids [8, 18]. In this section, 
more recent applications to viscous flows, including high Reynolds number turbulent flows, 
are described, including an improved multigrid strategy for the multiblock implementation on 
multiprocessing computers. 

3.1. Viscous flows 

The extension of the diagonalized method to treat viscous problems is complicated primarily by 
the fact that the viscous Jacobian matrices are not, in general, diagonalized by the same similarity 
transformations that diagonalize the Euler flux Jacobians. Thus, if the diagonal form is to be 
maintained, the contributions of the viscous terms to the implicit factor must either be approxi- 
mated or be neglected altogether. While the neglect of the viscous terms in the implicit factor is 
permissible in some cases, there are also situations in which at least approximate representations 
of these terms must be included for stability. 

For many cases of practical interest, it is sufficient to consider only the thin-layer form of 
the Navier-Stokes equations. In this form of the equations, only those viscous stresses acting on 
the grid surfaces approximately parallel to the body surface are included. The contributions of the 
viscous and heat-conduction terms to the residuals on the right-hand side of the equations are 
approximated by finite-volume formulas that are equivalent to a centered finite-difference approach 
[19]. The primary focus of the discussions here will be on the treatment of the terms in the implicit 
operator. 

For the case in which the transport coefficients are assumed locally to be constant, the 
linearization of the change in the viscous flux vector can be written in terms of the normal derivative 
of the Jacobian of the viscous flux vector with respect to the first derivative of the solution vector 
in the normal direction [20], The eigenvalues of this viscous Jacobian are real and distinct, although 
three of the four values (in two dimensions) are equal to within plus or minus about 30% for typical 
values of the specific heat ratio and Prandtl number [20]. 

This information suggests three approaches for treating the contributions of the viscous terms 
to the implicit operator for a diagonalized scheme: 

(A) Neglect the viscous contributions to the implicit operator altogether. 

(B) Include a diagonal approximation to the viscous contributions. 

(C) Include a third diagonalized factor to incorporate the viscous contributions. 

The latter option is possible since the eigenvalues of the viscous Jacobian are real and distinct. 
A separate factor is required since the transformation required to diagonalize the viscous 
Jacobian is, in general, different from those which diagonalize the Euler flux Jacobians. The 
diagonal approximation includes the identity matrix, multiplied by the largest of the eigenvalues 
of the viscous Jacobian, as the coefficient of the contributions of the viscous differences to the 
implicit operator. 

Analysis of these alternatives for a representative model problem [21] has shown that 
Methods A and C are only conditionally stable, with the stability region of Method C being 
somewhat larger than that of Method A; Method B is unconditionally stable for the model 
problem. 

Results will be presented here for a single calculation, that of the symmetric laminar flow at a 
Reynolds number of Re = 5000 past the NACA 0012 airfoil at 0° angle of attack and a free stream 
Mach number of M ro = 0.50. The grid for this calculation has a “C”-topology, containing 192 x 48 
cells in the wrap-around and body-normal directions, respectively, and extends in the far field to 
approx. 7.5 chord lengths from the mid-point of the airfoil chord. At the airfoil surface the normal 
spacing of the mesh is approx. 1.1 x 10“ 3 chords. The converged pressure distribution for this 
solution is shown in Fig. 1. 

On this grid, and for these flow conditions, the multigrid iteration diverges when local time 
stepping is used at a Courant number of CFL = 16.0 and the contribution of the viscous terms 
is neglected in the implicit operator: the solution begins to converge initially, until the residual is 
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FL02D: Implicit Viscous Thinlayer 
Mach 0.500 Alpha 0.000 Prat -1.000 

Cl 0.0000 Cd 0.0496 Cm 0.0000 

Grid 192x48 Work 299.53 Res0.952E-08 



FL02D: Implicit 
Mach 0.500 
Resi 0.339E-03 
Res2 0.952E-08 
Work 299.53 


scous Thinlayer 
Alpha 0.000 
Re 5000. 

Fbc -1. 

Rate 0.9656 


Prat -1.000 
CFL 16.00 
Grid 192x48 
Nmesh 5 


Fig. 1. Pressure distribution for viscous flow past a NACA Fig. 2. Iteration history for calculation of viscous flow past 

0012 airfoil at M„ = 0.50 and zero angle of attack; a NACA 0012 airfoil for the conditions of Fig. 1; five levels 

Re = 5000. of multigrid with local time stepping at CFL = 16; diagonal 

approximation of the viscous terms is included in the 
implicit operator (Method B). 


reduced about two orders of magnitude, but then an oscillation develops of increasing amplitude. 
This oscillation resembles a periodic shedding of vortices of opposite signs from the airfoil 
trailing edge, but the solution to this problem is expected to be steady. In fact, when an 
approximation to the viscous terms is included in the implicit factor, corresponding to Method B, 
the iteration converges. The iteration history for this case is shown in Fig. 2. Plotted are the 
logarithm of the average over all cells of the residual of the continuity equation | Ap/At |, the total 
number of grid cells in which the local Mach number is supersonic and the lift and drag coefficients, 
as a function of Work Units. The latter three quantities are plotted on arbitrary scales. Of course, 
for this subcritical, non-lifting case the number of supersonic cells and the lift coefficient are both 
zero. 

3.2. Turbulent flows 

The computation of high Re flows of practical interest requires both, the incorporation of a 
turbulence model to relate the effective Reynolds stresses to the properties of the mean flow and 
the ability of the algorithm to perform well on highly-stretched grids having very large geometric 
aspect ratios. In the near future, improved turbulence models will be investigated, but here the goal 
is to demonstrate good convergence characteristics for the basic multigrid algorithm on the grids 
required to resolve high Re turbulent flows. 

With this goal in mind, results are presented of a calculation using the simple two-layer algebraic 
turbulence model of Baldwin and Lomax [22], Results of diagonalized ADI multigrid calculations 
for several airfoils and flow conditions have been given by Varma and Caughey [23]; one of these 
solutions is repeated here. The flow past the RAE 2822 airfoil at M*, = 0.725 and 2.5° angle of 
attack, at Re c = 6.5 x 10 6 is computed on a grid containing 192 x 48 mesh cells in the wrap-around 
and body-normal directions, respectively. The computed airfoil surface pressure distribution is 
compared in Fig. 3 with the data of Cook et al. from the AGARD compendium of test cases [24], 
The agreement is quite good, although the angle of attack has been adjusted to match the lift 
coefficient of the calculation with that of the experiment (in which the angle of attack was given 
as 2.92°). Also, for cases in which the shock wave causes significant boundary layer separation, 
results using simple algebraic turbulence models are generally not as good as for this unseparated 
case [23, 25], 
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-i . 5000 r 



RAE 2822 AIRFOIL 

Mach .725 Alpha 2.500 Re 6.500E+06 
Present Work — Cook et al.[1979] ° . » 

Fig. 3. Surface pressure distribution for an RAE 2822 airfoil 
at M„ = 0.725 and 2.50° angle of attack; grid contains 
192 x 48 mesh cells in the wrap-around and body-normal 
directions, respectively. 



0 . 100 . 200 . 300 . 400 . 500 . 600 . 

RAE 2822 AIRFOIL 

Mach .725 Alpha 2.920 Re .650E+07 

Real .154E-02 CFL 24.00 

Res2 .464E-09 Grid 192x4B 

Work 500.19 Rate .9704 Nmesh 4 

Fig. 4. Iteration history for calculation of turbulent flow 
past an RAE 2822 airfoil for the conditions of Fig. 3; four 
levels of multigrid with local time stepping at CFL = 24. 


A plot of the convergence history for this case is shown in Fig. 4; the same variables are 
plotted as in Fig. 2. Using four levels of multigrid and a grid sequencing strategy (in which initial 
estimates for the solution are computed on coarser grids), the solution has converged to a steady 
state within approx. 50 Work Units; the average residual has been reduced about 7 orders of 
magnitude in 500 Work Units. 

In closing this section, it should also be mentioned that multigrid calculations using Jameson’s 
combination of explicit time-marching and implicit residual smoothing have also been performed 
by Martinelli et al. [26, 27] for two-dimensional flows and by Jayaram and Jameson [28] for flows 
in three dimensions. 

3.3. Multiprocessing multigrid on multiblock grids 

The implementation of the diagonalized ADI multigrid algorithm to solve the Euler equations 
on multiblock grids has been described by Yadlin and Caughey [8]; the extension to include 
the viscous terms of the thin-layer approximation to the Navier-Stokes equations has been 
reported by Yadlin et al. [29]. The former paper also reported experiments using multiprocessor 
computers in which the multigrid was implemented in two different modes: (1) a horizontal strategy, 
in which the multigrid cycles are advanced in phase in all the blocks; and (2) a vertical strategy, 
in which the multigrid cycles are advanced independently in each of the blocks. The principal 
difference between these two modes is the degree of interaction between the blocks during the 
multigrid cycle. In the horizontal mode, all of the blocks are in phase during the cycle, hence 
the data exchange between the blocks (i.e. the updating of boundary conditions on the inter-block 
boundaries) can be done easily at each level in the cycle. On the other hand, in the vertical 
mode the blocks are synchronized only at the beginning of each cycle, allowing for data exchange 
only once in the cycle, resulting in a freezing of the boundary conditions on the interfaces during 
the entire cycle. It is desirable from the standpoint of achieving high efficiency in the multi- 
processor environment to require as little synchronization as possible. Thus, the vertical mode is 
preferred. Also, the vertical mode allows much greater flexibility in constructing multigrid 
cycles — e.g. allowing different numbers of grid levels in different blocks. However, the implemen- 
tation of the vertical mode described above has been shown to result in poor iterative convergence 
rates due, most probably, to the fact that the interface boundary conditions on the coarse grids 
are not correct [8]. 
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One way to improve this updating scheme is to use an asynchronous updating, in which the 
interface boundary conditions are updated with the latest available data from adjacent blocks. 
That data may be new or old, depending upon the current stage of the multigrid process in the 
adjacent block. This asynchronous updating of the inter-block boundary conditions has been 
implemented using a set of surface arrays, that act as buffers for information transfer between the 
blocks. Each block has a surface array that holds a copy of the solution vector in the two layers 
nearest the boundaries of the block; data from this array is read into the layers of dummy cells 
of the adjacent blocks. 

Using these surface arrays, advancing the solution one time step on any grid level of the multigrid 
cycle involves: 

1 . Updating the interface boundaries by reading data from the surface arrays of the 
adjacent blocks. 

2. Advancing the solution within the block one time step. 

3. Writing the appropriate layers of the solution into the block surface arrays. 

A similar sequence of steps is performed in the interpolation step of the multigrid cycle. 

The order in which the blocks are updated dictates which of the blocks will use surface arrays 
with updated data and which ones will use old data. (A locking mechanism is provided when this 
algorithm is implemented on parallel computers to prohibit a block from reading from a surface 
array at the same time that another block is writing to the same array.) It is worth noting that 
the order of updating can affect the intermediate values of the solution (e.g. time accuracy is not 
maintained, and asymmetries may be introduced into solutions that ultimately converge to 
symmetric solutions), but the interest here is only in achieving rapid convergence to the steady-state 
solution. 

The result presented here is for the transonic flow past the ONERA wing M-6 [24]; M rj0 = 0.839 
and the angle of attack is 3.06°. The Euler equations are solved on a grid containing 192 x 32 x 32 
cells in the wrap-around, body-normal and span-wise directions, respectively. The reference grid 
is divided into eight blocks. In order to verify that the inter-block boundary conditions were being 
properly treated in the limit of the steady solution, the four blocks containing the wing and wake 
were further subdivided into two sets of blocks, having a common interface in the middle of the 
wing. Contours of constant pressure on the wing upper surface for this case show no visible effects 
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ONERA WING M8(8 B LOCK, Vert, msurf = 1 

Mach .838 Alpha 3.060 

Real .223E+00 CFL 12.00 

Ree2 .115E-02 arid 192x32x32 

Work 149.00 Rate .9853 Nmeah 1 

Fig. 5. Convergence history for the diagonal implicit scheme 
on a multiblock grid for flow past an ONERA M-6 wing at 
= 0.839 and 3.06° angle of attack; local time stepping 
without multigrid at CFL = 12. 


ONERA WING M6(12 BLOCK, Vert, msurf =1) 
Mach .839 Alpha 3.080 

Reel .226E+00 CFL 12.00 

Res2 .197E-04 Grid 192x32x32 

Work 149.82 Rate .9395 Nmeeh 4 

Fig. 6. Convergence history for the diagonal implicit scheme 
on a multiblock grid using four levels of multigrid for flow 
past an ONERA M-6 wing at M„ = 0.839 and 3.06° angle 
of attack; local time stepping at CFL = 12. 
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of the inter-block boundary, even where the shock waves cross this artificial surface; in addition, 
the lift and drag coefficients computed on this grid are identical to those for the eight-block grid, 
which does not have the extra inter-block boundaries in this high-gradient region. 

The convergence histories for this case are presented in Figs 5 and 6. These figures show the 
same variables as plotted earlier for calculations on the multiblock grid in vertical mode without 
multigrid and when using four levels of multigrid. The effect of multigrid on the convergence rates 
is clear: using four levels of multigrid, the lift coefficient (a good measure of global convergence) 
has converged to within plottable accuracy of its final value in fewer than 40 Work Units while 
without multigrid more than 150 Work Units is required; with multigrid the average residual 
has been reduced by about four orders of magnitude in 1 50 Work Units, while without multigrid 
a reduction in error of only two orders of magnitude was achieved in the same amount of work. 

Calculations for the same test case using the horizontal mode demonstrate that there is very little, 
if any, degradation in performance caused by use of the vertical mode. This is in distinct contrast 
to earlier results (presented in Ref. [8]), in which the boundary conditions on the inter-block 
boundaries were frozen during the entire multigrid cycle. In those earlier calculations, there was 
a significant degradation of convergence rate for the vertical mode. 

4. CONCLUDING REMARKS 

A multigrid implementation of a diagonalized ADI algorithm to solve the Euler equations 
of inviscid, compressible flow has been reviewed. Recent extensions demonstrate that the 
algorithm is also effective for solving the Navier-Stokes equations, including high Re turbulent 
flows, and an improved vertical multigrid strategy for implementation of the algorithm on multi- 
processing computers using multiblock grids allows greater parallel efficiencies to be achieved on 
multiprocessing computers. 
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